Hello Rohan,

> The competition was part of the coursework. It was a classification problem.
> The data we were given had 2500 features, 10,000 samples, and 20 classes that
> we had to classify it into. This was a dataset hand-engineered by our teaching
> assistants. The data also had missing information. Some of the
> competition-winning algorithms like XGBoost failed to perform well. I won the
> competition by using SVM and neural networks.
Wow, I expected that XGBoost would provide some reasonably good results.

> However, using RL for trading has its own advantages. It can learn directly
> from a simulation environment and will be able to adapt to the latency of
> the environment, because it will be receiving negative reward during the
> latency period. This is, however, not possible with deep learning
> techniques, which cannot work around this latency. Also, using deep learning
> techniques, the agent can learn complex policies that can't be learned by
> humans.

Thanks for the information; sounds like you already put some time into the
idea. I'll see if I can take a closer look at the papers in the next few days.
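Just to check that I am reading the latency argument correctly: here is a toy
sketch of what I have in mind (plain C++, nothing to do with mlpack's API; the
market, the price path, and every name here are invented for the example). An
order placed now only fills a few steps later, and in the meantime the agent
keeps receiving reward from the position it actually holds, so the cost of the
delay shows up directly in the learning signal.

// Toy illustration only: an order placed now fills `latency` steps later, and
// the agent receives (negative) reward while it waits, so the cost of the
// delay is part of the learning signal.  This is NOT mlpack code.
#include <iostream>
#include <vector>

struct ToyMarket
{
  std::vector<double> prices;    // exogenous price path
  size_t t = 0;                  // current time step
  int position = 0;              // -1 short, 0 flat, +1 long
  int pendingOrder = 0;          // order waiting to be filled
  int stepsUntilFill = 0;        // latency countdown
  int latency = 3;               // steps between placing and filling an order
  double pendingPenalty = 0.05;  // per-step cost of waiting for the fill

  // Place an order; it only takes effect after `latency` steps.
  void PlaceOrder(const int order)
  {
    pendingOrder = order;
    stepsUntilFill = latency;
  }

  // Advance one step and return the reward for this step.
  double Step()
  {
    double reward = 0.0;
    if (pendingOrder != 0)
    {
      // Still waiting: the agent is penalized for every step of latency.
      reward -= pendingPenalty;
      if (--stepsUntilFill == 0)
      {
        position = pendingOrder;  // the order finally fills at today's price
        pendingOrder = 0;
      }
    }
    // P&L of the position we actually hold over this step.
    reward += position * (prices[t + 1] - prices[t]);
    ++t;
    return reward;
  }
};

int main()
{
  ToyMarket market;
  market.prices = { 100, 101, 103, 102, 101, 100, 99, 98 };
  market.PlaceOrder(+1);  // decide to go long while the price is still low...
  double total = 0.0;
  for (size_t i = 0; i + 1 < market.prices.size(); ++i)
  {
    const double reward = market.Step();
    total += reward;
    std::cout << "t=" << i << "  reward=" << reward << std::endl;
  }
  // ...but by the time the order fills, the favorable move is already over.
  std::cout << "total=" << total << std::endl;
  return 0;
}

A supervised predictor trained on labelled price moves never experiences that
delay, which I think is the point you are making.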
> Sent with MailTrack

Tracking who is reading your emails without their consent is unethical. Please
consider not using software like this.

Thanks,
Marcus

> On 25. Feb 2018, at 06:12, ROHAN SAPHAL <[email protected]> wrote:
>
> Hi Marcus,
>
> Sorry for the delayed reply. I was traveling the past two days.
>
>> That sounds really cool, what kind of competition was that?
>
> The competition was part of the coursework. It was a classification problem.
> The data we were given had 2500 features, 10,000 samples, and 20 classes that
> we had to classify it into. This was a dataset hand-engineered by our
> teaching assistants. The data also had missing information. Some of the
> competition-winning algorithms like XGBoost failed to perform well. I won the
> competition by using SVM and neural networks.
>
>> The idea sounds interesting, do you have some particular methods/papers in
>> mind you would like to work on? Since the methods listed on the ideas page
>> are just suggestions, this could be a GSoC project.
>
> The two most frequently used algorithms are Q-learning and recurrent
> reinforcement learning. First, I want to mention the challenges in the
> trading domain:
>
> Environment: The trading system is a POMDP and consists of multiple other
> trading agents. We can take two approaches here: we can assume the other
> agents to be part of the environment and then allow our agent to learn in
> that environment, or we could take a multi-agent approach where we try to
> reverse-engineer the trading strategies of the other agents and then learn
> to exploit them. This moves into the multi-agent RL domain, which is
> currently an ongoing research field.
>
> Action spaces: We can have our agent use a discrete action space, which is
> basically 3 actions: buy, hold, or sell. Increasing the complexity, we can
> have the agent also choose the amount to be invested, which is a continuous
> action space. Further, we could have the agent decide when to place orders
> and in what quantity, which makes the problem much more complex. This level
> of complexity is required if we want to make profits on a regular basis.
>
> Reward function: Although it may seem intuitive to feed the profit/loss as
> the reward, it may not be a good idea, as this reward is sparse and
> infrequent. We could instead feed the unrealized profit/loss, which is less
> sparse and allows the agent to learn to trade profitably; this, however,
> biases the agent when an actual profit/loss is realized. The other
> possibility is to choose a reward function that tends to reduce the risk
> involved, like the Sharpe ratio or maximum drawdown. We might have to choose
> multiple reward functions to trade off between profit and risk.
>
> However, using RL for trading has its own advantages. It can learn directly
> from a simulation environment and will be able to adapt to the latency of
> the environment, because it will be receiving negative reward during the
> latency period. This is, however, not possible with deep learning
> techniques, which cannot work around this latency. Also, using deep learning
> techniques, the agent can learn complex policies that can't be learned by
> humans.
>
> I feel a good starting point would be to implement a state-of-the-art
> recurrent reinforcement learning algorithm and then improve on it by
> incorporating multiple agents, continuous action spaces, etc. Hoping to hear
> suggestions from the mentors.
>
> PFA some relevant papers.
>
> Regards,
> Rohan Saphal
>
> Sent with Mailtrack
>
> On Tue, Feb 20, 2018 at 11:35 PM, ROHAN SAPHAL <[email protected]> wrote:
>
>> Hi,
>>
>> I am Rohan Saphal, a pre-final year undergraduate from the Indian Institute
>> of Technology Madras.
>>
>> My research interest is in artificial intelligence, specifically in deep
>> reinforcement learning. I have been working with Prof. Balaraman Ravindran
>> <https://scholar.google.co.in/citations?user=nGUcGrYAAAAJ&hl=en> on
>> multi-agent reinforcement learning and will continue to do my final degree
>> thesis project under his guidance. I am currently a graduate research
>> intern at Intel Labs working on reinforcement learning. Previously, I was a
>> computer vision intern at Caterpillar Inc. As part of the machine learning
>> course, a competition was organized among the students, and I secured 1st
>> place in that competition
>> <https://www.kaggle.com/c/iitm-cs4011/leaderboard>. I am familiar with deep
>> learning and have completed the fast.ai MOOC along with a course offered at
>> our institute.
>>
>> I have read the papers related to the reinforcement learning algorithms
>> mentioned on the ideas page, and I am interested in working on the
>> reinforcement learning module.
>>
>> I have compiled mlpack from source and am looking at the code structure of
>> the reinforcement learning module. I am unable to find any tickets at
>> present and hope that someone could direct me on how to proceed.
>>
>> I have been interested in using reinforcement learning for equity trading,
>> and recurrent reinforcement learning algorithms in particular have
>> interested me. I believe the stock market is a good environment (a POMDP)
>> to test and evaluate the performance of such algorithms, as it is a highly
>> challenging setting. There are many agents involved in the environment, and
>> I feel that developing reinforcement learning algorithms that can trade
>> efficiently in such a setting is an interesting problem. Deep learning
>> algorithms like LSTMs cannot capture the latency involved in the system and
>> hence cannot make real-time predictions. Reinforcement learning algorithms,
>> however, could learn how to interact under the latency constraint to make
>> real-time predictions. Some areas where I see work to be done are:
>>
>> - Implement the latest work(s) in multi-agent reinforcement learning
>>   algorithms.
>> - Implement recurrent reinforcement learning algorithm(s) that capture the
>>   temporal nature of the environment; modifications can be made to existing
>>   work.
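Regarding the recurrent reinforcement learning item: as far as I understand
it, the usual formulation makes the position at time t a function of both the
current features and the previous position, and charges a transaction cost for
changing the position. A rough sketch of that recursion (again just
illustrative C++, not mlpack code and not any of the attached papers verbatim;
the feature layout, weights, and names are my own):

// Rough sketch of a recurrent-reinforcement-learning style trader: the
// position F_t depends on the current features AND the previous position,
// and the per-step trading return charges a cost for changing the position.
// Purely illustrative; all names and numbers here are invented.
#include <cmath>
#include <cstdio>
#include <vector>

struct RecurrentTrader
{
  std::vector<double> w;  // weights on the feature vector x_t
  double u;               // weight on the previous position F_{t-1}
  double b;               // bias
  double prevPosition;    // F_{t-1}, in [-1, 1]

  // F_t = tanh(w . x_t + u * F_{t-1} + b)
  double Position(const std::vector<double>& x) const
  {
    double z = b + u * prevPosition;
    for (size_t i = 0; i < w.size(); ++i)
      z += w[i] * x[i];
    return std::tanh(z);
  }

  // R_t = F_{t-1} * r_t - cost * |F_t - F_{t-1}|, where r_t is the asset's
  // return over the step and `cost` models transaction costs.
  double Step(const std::vector<double>& x, double assetReturn, double cost)
  {
    const double position = Position(x);
    const double reward = prevPosition * assetReturn -
        cost * std::abs(position - prevPosition);
    prevPosition = position;
    return reward;
  }
};

int main()
{
  RecurrentTrader trader{{ 0.5, -0.2 }, 0.3, 0.0, 0.0};
  const std::vector<std::vector<double>> features =
      {{ 0.1, 0.0 }, { 0.2, -0.1 }, { -0.3, 0.2 }};
  const std::vector<double> assetReturns = { 0.01, -0.02, 0.005 };
  for (size_t t = 0; t < features.size(); ++t)
    std::printf("R_%zu = %f\n", t,
        trader.Step(features[t], assetReturns[t], 0.001));
  return 0;
}

The previous position entering both the policy and the per-step return is what
makes the "recurrent" part matter, and it is also the piece a purely
supervised predictor does not see.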
>> I would like to hear from mentors what they feel about the suggested idea
>> and whether it seems like an acceptable project to propose for GSoC.
>>
>> Thanks for your time. Hope to hear from you soon. Feel free to ask for any
>> more details about me or my work.
>>
>> Regards,
>> Rohan Saphal
>
> <rrl.pdf><RRL.pdf><07376685.pdf><LvDuZhai.pdf><SSRN-id2594477.pdf>
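P.S. On the reward-function point in your mail above: if a risk-adjusted
signal such as a Sharpe-ratio-style reward is used, one simple option is to
compute it over a rolling window of recent per-step returns, roughly like this
(only a sketch, not mlpack API; the window length and all names are invented
for the example):

// Rough sketch of a rolling Sharpe-ratio style reward: the agent is rewarded
// for the mean of recent per-step returns relative to their volatility.
// Purely illustrative; the window length and names are invented here.
#include <cmath>
#include <cstdio>
#include <deque>
#include <numeric>

class SharpeReward
{
 public:
  explicit SharpeReward(const size_t window) : window(window) { }

  // Push the latest per-step return and get the current risk-adjusted reward.
  double Update(const double stepReturn)
  {
    returns.push_back(stepReturn);
    if (returns.size() > window)
      returns.pop_front();
    if (returns.size() < 2)
      return 0.0;  // not enough history yet

    const double mean = std::accumulate(returns.begin(), returns.end(), 0.0) /
        returns.size();
    double var = 0.0;
    for (const double r : returns)
      var += (r - mean) * (r - mean);
    var /= (returns.size() - 1);

    const double eps = 1e-8;  // avoid division by zero in flat markets
    return mean / (std::sqrt(var) + eps);
  }

 private:
  size_t window;
  std::deque<double> returns;
};

int main()
{
  SharpeReward reward(4 /* window length, arbitrary for the example */);
  const double stepReturns[] = { 0.01, -0.005, 0.02, 0.0, -0.01 };
  for (const double r : stepReturns)
    std::printf("reward = %f\n", reward.Update(r));
  return 0;
}

How stable such a rolling estimate is as a per-step reward is exactly the
profit-versus-risk trade-off you mention.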
_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
