Re: [Wesnoth-dev] investigating whether wesnoth is suitable for my AI thesis

Roald (ubuntu_demon) Fri, 01 Jun 2007 02:46:44 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John McNabb wrote:
> Welcome to wesnoth.


Thanks :)
> 
> I am not much of a python person, so I will let those who are more
> closely involved with the python bindings answer those questions.  

Could you please point me to some people who work on the python bindings?
I might be willing to help with the parts that are of interest to my
research.

> I
> have, however, been working on a new C++ AI that will add a lot of
> flexibility via WML, and I have been thinking about some of the issues
> you have raised involving running the AI from the command line and
> training it, so I will try and answer those as best I can.
> 

To be clearer: this might be a good approach from an AI research
perspective:

phase 1) offline machine learning:
- to find out whether it's possible to find map-specific things
- to find a good state space representation

phase 2) offline reinforcement learning (or some other learning
technique) to find good bootstrapped weights:
- probably map-specific
- maybe taking the faction(s) of the opponent(s) into account

phase 3) online learning: running the AI inside a normal game. The AI
should continue to learn and adapt.

When I say "offline learning" I mean running a script which runs wesnoth
from the command prompt without worrying about human players. When I say
"online learning" I mean the AI should continue to learn against human
players, and it should be fast.
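
A minimal sketch of the offline loop I have in mind, assuming only that one
game can be started as a single command. The name `run_offline_games` is
hypothetical, and the actual wesnoth arguments are deliberately not spelled
out here; the caller has to supply them:

```python
import subprocess
import sys
import time


def run_offline_games(command, n_games):
    """Run `command` n_games times; collect exit code, output and duration.

    `command` is the full argument list for one game. The wesnoth arguments
    are an assumption left to the caller and are not defined in this sketch.
    """
    results = []
    for game in range(n_games):
        start = time.time()
        proc = subprocess.run(command,
                              stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT,
                              universal_newlines=True)
        results.append({
            "game": game,
            "returncode": proc.returncode,
            "output": proc.stdout,
            "seconds": time.time() - start,
        })
    return results


if __name__ == "__main__":
    # Stand-in command so the sketch runs without wesnoth installed.
    demo = [sys.executable, "-c", "print('game over')"]
    for result in run_offline_games(demo, 2):
        print(result["game"], result["returncode"], result["output"].strip())
```

The captured output and the measured duration per game are what the learning
part of the script would work with afterwards.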

>> * I have to figure out some practical stuff and I'm hoping you guys can
>> help me shed some light on them.
> 
> we will certainly try...

thanks :)

> 
>> I read a little bit about restricting python. I do like
>> reviewing/renaming solutions but personally I don't like technical
>> restrictions. Python being powerful/versatile makes it a nice language
>> for doing research with (you can hook into all kinds of existing stuff,
>> you can quickly prototype something). IMHO you probably can't restrict
>> any c++ AI so IMHO considering restricting python AI's but not c++ AI's
>> seems inconsistent. Also it's not like there are lots of python AI's
>> available for wesnoth so the reviewing/renaming solution probably
>> doesn't add much to the workload of the volunteers who do this.
> 
> The difference between the C++ AI and the python AI is that the C++ is
> essentially restricted at compile time.  If the program is well
> behaved when it is distributed, there is no particular problem.  Since
> the python AI is interpreted, however, it is conceivable that someone
> could execute malicious code unwittingly.  Ok, so the individual would
> have had to have downloaded the campaign/whatever that had the
> malicious python in it, but what it comes down to is that the user
> made content server is not policed the way that the C++ code is.
> Anyone can upload content to the campaign server and malicious code
> might reside there for quite some time before anyone noticed it,
> especially if it was careful to hide its tracks.  That said, I don't
> really know much about the plans for restricting python.

Having a volunteer approve each python script still sounds like
the best solution to me.

I would be very interested to learn more about the plans for restricting
python from someone who is involved with them. Maybe allefant is the guy
to talk to about that? What is his email?

> 
>>  * I think it's possible for games to be played within a few minutes. I
>> did some testing with default AI's which confirms this (starting from
>> command line games take roughly 10-80 seconds). My research AI will
>> probably be a bit more time consuming than this default AI but I think
>> I'm safe here.
> 
> This certainly should be possible.

Thought so :)

> 
>>  * As I understand it each turn the AI is re-initialized. Assuming
>> set_variable() and get_variable() will only work for variables and not
>> for (more complex) objects, dictionaries, lists, lists of lists (matrices)
>> this will mean I have to pickle my objects and matrices. Assuming that
>> importing cPickle is allowed this will probably mean an overhead in the
>> order of seconds per game. (IMHO it would be nicer if you force python
>> AI's to implement a save and load function but that approach might have
>> the drawback of a slightly steeper learning curve)
> 
> One thing that I implemented in the AI for the C++ is a memory that is
> saved from turn to turn.  Basically any WML construct can be saved for
> use in future turns.  The memory is split by team so that each AI can
> remember different things.  I don't know if the python has any
> bindings to use this yet, but I think that it would be good if they
> did.
> 

IMHO cPickle is something I can probably use right now.
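
A minimal sketch of how I could keep my matrices between turns, assuming that
the AI script is allowed to import the pickle module and to write a file. The
helper names `save_state` and `load_state` and the file name are hypothetical:

```python
import os
import tempfile

try:
    import cPickle as pickle  # faster C implementation where it exists
except ImportError:
    import pickle


def save_state(state, path):
    """Pickle `state` (dicts, lists, lists of lists, ...) to `path`."""
    with open(path, "wb") as handle:
        pickle.dump(state, handle, pickle.HIGHEST_PROTOCOL)


def load_state(path, default=None):
    """Return the pickled state, or `default` if no state file exists yet
    (for example on the first turn of the first game)."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "ai_state.pkl")
    state = load_state(path, default={"turns_seen": 0, "weights": [[0.0, 0.0]]})
    state["turns_seen"] += 1
    save_state(state, path)
    print(load_state(path))
```

Loading at the start of a turn and saving at the end of it is the part that
would cost the overhead I mentioned.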

>> So here's a list of practical stuff that's not yet clear to me :
>>
>>  * if python will be restricted somehow in the future will I be able to
>> turn this off by compiling wesnoth with --unrestricted-python ?
> 
> That seems like a reasonable request, but it will depend on how
> difficult it is to code the python restrictions.

If someone can tell me whether such a compile switch will be implemented,
I won't have to be afraid that my stuff suddenly stops working when a new
development version is released (if I choose to work with the development
branch, which I'm inclined to do). This is important to me.

> 
>>  * Do you think I should run the 1.3 development branch of wesnoth or
>> the 1.2 stable branch of wesnoth ? Why ?
> 
> Well, it will depend on exactly what you want to do.  There are
> definitely several nice features that have been added since 1.2, but
> on the other hand, several things are in a pretty major state of flux
> right now in 1.3.   Looking through the change log since 1.2 for
> python related things results in this list:
> [code]
>  * increased required version of python from >=2.3 to >=2.4
>    * fixed detection of installed python versions to work on systems that do
>      not have python installed at /usr/ (like MacOSX using fink)
>  * Python AI
>    * Added various input validations
>    * Set Python errors upon error
>    * added support for optipng optimization in the compilation process
> [/code]
> 

From the change log it doesn't sound as if much has changed regarding the
python API.

> Personally, if I were you, I would use the latest development version
> at the time you actually start doing serious coding.  It will probably
> be easier to get devs to make any changes that are needed by you on
> the development branch than on the stable branch.  

Good argument. It probably would be easier to get devs to make any
changes that are needed. And I hope that any change I make myself would
have a better chance of getting accepted, provided it's actually nice
code (though I'm no c++ guy).

> But then again, I'm
> a bleeding edge type of guy.
> 
>>  * Is there some way for the python AI to know when the game is finished
>> ? It's really important to have something like this because without this
>> it makes it really hard to make your AI actually learn.
>> I would like to know in which place I (the current python AI)
>> finished, and I would need to know how many players were in the game
>> (finishing 2nd out of 2 players is bad whereas finishing 2nd out of 8
>> players is quite good). Most learning probably happens after a game is
>> finished.
> 
> I don't believe that this is possible the way you have described it.

For "phase 3 online learning" it is essential to have this. It is
essential if you want to have an AI which is able to continue to adapt
and learn while playing against humans.

From a research perspective "phase 1" and "phase 2" are far more
important. But from a user's perspective it is more fun to play
against an AI which continues to learn and adapt.

IMHO it's possible to throw in automatic difficulty scaling to make sure
playing against this AI is fun for everyone (not just those who
play as well as or better than this AI). This is probably out of scope for
my research but might be fun to try afterwards.

If there is "finishing stuff" available it becomes easier to write an AI
which learns. This might attract more people to write AI's for wesnoth.

Can someone point me to the relevant portions of the code (python bindings
stuff) that would need to be modified to make this work? If it's not too
hard to do I might give it a shot myself.

> What can be done with very little modification, however, is to make
> sure that the final state is saved at the end of the game 

Where in the code should this modification be made?

> and then
> whatever script you are using to run wesnoth from the command line
> could analyze the save game to determine how to modify the AI
> parameters.

Thanks for this suggestion. This is probably sufficient for "phase 1"
and "phase 2". But it isn't suitable for "phase 3".

Where can I find information about the savegame format? Is it easy to
parse?

> 
>>  * To be specific : I would like to have the following "finishing stuff"
>> : knowing whether or not the game is finished, knowing which place I
>> finished at (knowing the place of all players would be nice), knowing
>> whether the game is started from command line (all time available) or gui
>> (less time available either enforced or just make sure not to use too
>> much time), being able to do some computation when the game is finished
> 
> Again, I think the simplest way to do this for your research is to
> have a script running wesnoth in command line mode and analyzing the
> save game outside of wesnoth itself.

It's not an elegant solution but it probably will work.

> 
>>  * Consider the following 2-player example: if I use some basic
>> reinforcement learning technique (such as q-learning) I can give a
>> positive reward to all my actions when I finish in first place and a
>> negative reward to all my actions when I finish in 2nd place. Giving out
>> this reward probably isn't computationally heavy but it needs to happen
>> after the game is finished.
> 
> see above.
> 
>>  * Consider another (2 player) example. My python AI plays against the
>> default AI but after the game is finished I apply some Machine
>> Learning/Statistical technique to find interesting places on the map.
>> This can actually be computationally heavy if needed because ideally it
>> only needs to be done once for each map and it can be done offline by
>> me. The stuff you learn here can be used "online" in the map specific
>> part of my AI.
> 
> still consistent with doing the "learning" portion outside of wesnoth proper.
> 
>>  * When running wesnoth from GUI python AI's should make sure not to use
>> (too much) noticeable time.
> 
> Yes, but I wouldn't worry about this too much.  If you are planning on
> running a lot of iterations in order for the AI to learn, you are
> going to want it to be pretty fast to begin with.

True. It's probably not a problem.

> 
>>  * When running wesnoth from command line the allowed time for
>> computation after having finished a game isn't important because human
>> players don't see this. Only some AI developers would use it.
> 
> Again, if you do this from an external script, this is not a problem.
> 
>>  * If this "finishing information" doesn't exist can this be included in
>> the next development version of Wesnoth ?
> 
> I think it would be quite possible to add in an end of game '-save
> filename' argument for command line use.  I have pondered adding it
> myself previously but have put it off until my own AI actually needs
> it.  If it turns out that this is something that you need, I could
> push it up on my priority list.  I don't really know about adding in
> an end of game python AI analysis call.

Regarding the "finishing information" I currently see three options:

1)
 * Work on the python bindings. This is the most elegant solution, but I
have no idea how hard it is.

2)
 * Adapt the information wesnoth prints when running from the
command line (currently it only shows who wins). I have no idea how hard
this is.

What I would have to do in my python AI (easy):

 * make my python AI use a) a state-action matrix for the current game
and b) a state-action matrix with the actual learned weights

What I would have to do in my external script (probably easy):
 * parse the command line output and do the actual learning, i.e. adapt
the second matrix based on the result of the game and the contents of
the first matrix

Note: in reality there might be more matrices if I consider maps, sides
and game phases, but I wanted to keep the explanation simple.

3)
 * Add a command line option that makes it possible to save games.
 * Add a data entry to the savegames for "finishing information"
which can be parsed by the external script, OR make the external script
able to parse savegames if there are entries like "player 1 died".
 * Use an external script to do the actual learning (either by
learning everything from the savegames or by doing it with two matrices
as in option 2).
> 
> Good luck.
> John (aka Darth Fool).

Thank you very much for your quick reply.

Regards,

Roald (ubuntu_demon)


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGX+qtadqpfxv/6LsRAspAAJ4zearxKhXE6cu7HfSuIILM9YfmZACghxDW
GxKsM+1he6wZEvCinkzSy0A=
=G/7s
-----END PGP SIGNATURE-----

_______________________________________________
Wesnoth-dev mailing list
[email protected]
https://mail.gna.org/listinfo/wesnoth-dev
