Hello Rohan,

> The competition was part of the coursework. It was a classification problem:
> the data we were given had 2,500 features, 10,000 samples, and 20 classes to
> classify them into. It was a hand-engineered dataset built by our teaching
> assistants, and it also had missing values. Some of the usual
> competition-winning algorithms like XGBoost failed to perform well. I won the
> competition using an SVM and neural networks.

Wow, I would have expected XGBoost to provide reasonably good results.

> However, using RL for trading has its own advantages. The agent can learn
> directly from a simulation environment and can adapt to the latency of the
> environment, because it receives negative reward during the latency period.
> This is not possible with supervised deep learning techniques, which cannot
> work around this latency. Also, with deep RL the agent can learn complex
> policies that humans could not design by hand.

Thanks for the information; it sounds like you have already put some time into
the idea. I'll see if I can take a closer look at the papers in the next few
days.

> Sent with MailTrack


Tracking who is reading your emails without their consent is unethical.
Please consider not using software like this.

Thanks,
Marcus




> On 25. Feb 2018, at 06:12, ROHAN SAPHAL <[email protected]> wrote:
> 
> 
> Hi Marcus,
> 
> Sorry for the delayed reply. I was traveling the past two days.
> 
> That sounds really cool; what kind of competition was that?
> The competition was part of the coursework. It was a classification problem:
> the data we were given had 2,500 features, 10,000 samples, and 20 classes to
> classify them into. It was a hand-engineered dataset built by our teaching
> assistants, and it also had missing values. Some of the usual
> competition-winning algorithms like XGBoost failed to perform well. I won the
> competition using an SVM and neural networks.
> 
> The idea sounds interesting. Do you have particular methods/papers in mind
> that you would like to work on? The methods listed on the ideas page are just
> suggestions, so this could be a GSoC project.
> 
> The two most frequently used algorithms are Q-learning and recurrent
> reinforcement learning. First, I want to mention the challenges in the
> trading domain.
> Environment: The trading system is a POMDP and contains many other trading
> agents. We can take two approaches here: we can treat the other agents as
> part of the environment and let our agent learn in that environment, or we
> can take a multi-agent approach where we try to reverse-engineer the trading
> strategies of the other agents and then learn to exploit them. The latter
> moves into multi-agent RL, which is currently an active research field.
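> 
> As a rough illustration of the single-agent view (hypothetical names, not
> mlpack's actual API), the other market participants would simply be folded
> into the environment's step dynamics behind a small interface like this:
> 
>   #include <utility>
>   #include <vector>
> 
>   // Minimal sketch of a partially observable trading environment. Whatever
>   // the other traders do is hidden inside Step(); the agent only sees its
>   // own (partial) observation and reward.
>   struct Observation { std::vector<double> features; };
> 
>   class TradingEnv
>   {
>    public:
>     // Reset the market simulator and return the first observation.
>     virtual Observation Reset() = 0;
> 
>     // Apply an action, advance the simulated market one tick, and return
>     // the reward together with the next observation.
>     virtual std::pair<double, Observation> Step(int action) = 0;
> 
>     virtual bool Done() const = 0;
>     virtual ~TradingEnv() = default;
>   };
> 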
> Action spaces: We can give the agent a discrete action space, which is
> basically three actions: buy, hold, or sell. Increasing the complexity, we
> can also let the agent choose the amount to be invested, which is a
> continuous action space. Going further, the agent could decide when to place
> orders and in what quantity, which makes the problem much more complex. This
> level of complexity is probably needed to make profits on a regular basis.
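> 
> To make the two cases concrete, a minimal sketch (purely illustrative, not
> tied to any existing mlpack type) of the two action representations could be:
> 
>   // Discrete case: three possible actions per step.
>   enum class DiscreteAction { Buy, Hold, Sell };
> 
>   // Continuous case: the action is the target position in [-1, 1], where
>   // -1 is fully short, 0 is flat, and +1 is fully long; the magnitude
>   // encodes how much capital to commit, subsuming buy/hold/sell.
>   struct ContinuousAction { double targetPosition; };
> 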
> Reward function: Although it may seem intuitive to use realized profit/loss
> as the reward, it may not be a good idea because this reward is sparse and
> infrequent. We could instead use unrealized profit/loss, which is less sparse
> and still lets the agent learn to trade profitably; however, it biases the
> agent when an actual profit/loss is realized. Another possibility is to
> choose a reward function that accounts for risk, such as the Sharpe ratio or
> maximum drawdown. We might have to combine multiple reward terms to trade off
> profit against risk.
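> 
> As a sketch of the risk-adjusted idea (an assumed helper, not an existing
> mlpack function), a reward based on a rolling Sharpe ratio over the most
> recent per-step returns could look like:
> 
>   #include <cmath>
>   #include <numeric>
>   #include <vector>
> 
>   // Sharpe-style reward: higher mean return and lower volatility both
>   // increase the reward, so the agent is pushed toward steadier profits.
>   double SharpeReward(const std::vector<double>& returns)
>   {
>     if (returns.size() < 2)
>       return 0.0;
> 
>     const double mean = std::accumulate(returns.begin(), returns.end(), 0.0)
>         / returns.size();
> 
>     double var = 0.0;
>     for (const double r : returns)
>       var += (r - mean) * (r - mean);
>     var /= (returns.size() - 1);
> 
>     const double eps = 1e-8;  // avoid division by zero on flat returns
>     return mean / (std::sqrt(var) + eps);
>   }
> 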
> However, using RL for trading has its own advantages. The agent can learn
> directly from a simulation environment and can adapt to the latency of the
> environment, because it receives negative reward during the latency period.
> This is not possible with supervised deep learning techniques, which cannot
> work around this latency. Also, with deep RL the agent can learn complex
> policies that humans could not design by hand.
> 
> I feel a good starting point would be to implement a state-of-the-art
> recurrent reinforcement learning algorithm and then improve on it by
> incorporating multiple agents, continuous action spaces, etc. I hope to hear
> suggestions from the mentors.
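> 
> For reference, the recurrent reinforcement learning traders in the literature
> usually follow Moody and Saffell's direct-reinforcement formulation; a
> minimal sketch of one decision step (illustrative names only, not an mlpack
> API) could look like:
> 
>   #include <cmath>
>   #include <cstddef>
>   #include <vector>
> 
>   // One step of a simple recurrent trading policy. The previous position
>   // prevF is fed back as an input, which is what makes the policy
>   // recurrent. theta holds one weight per market feature, then a weight
>   // for prevF, then a bias term as the last element.
>   double NextPosition(const std::vector<double>& theta,
>                       const std::vector<double>& features,
>                       const double prevF)
>   {
>     double z = theta.back();  // bias term
>     for (std::size_t i = 0; i < features.size(); ++i)
>       z += theta[i] * features[i];
>     z += theta[features.size()] * prevF;  // feedback of the last position
>     return std::tanh(z);                  // position in [-1, 1]
>   }
> 
>   // Dense per-step learning signal: profit from holding the previous
>   // position minus a transaction cost for changing position.
>   double StepReturn(const double prevF, const double newF,
>                     const double priceReturn, const double costPerUnit)
>   {
>     return prevF * priceReturn - costPerUnit * std::fabs(newF - prevF);
>   }
> 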
> 
> Please find attached some relevant papers.
> 
> 
> 
> Regards,
> 
> Rohan Saphal
> 
> 
> 
> 
> 
> Sent with Mailtrack
> 
> On Tue, Feb 20, 2018 at 11:35 PM, ROHAN SAPHAL <[email protected]> wrote:
> Hi,
> 
> I am Rohan Saphal, a pre-final-year undergraduate at the Indian Institute of
> Technology Madras.
> 
> My research interests are in artificial intelligence, and specifically in
> deep reinforcement learning.
> I have been working with Prof. Balaraman Ravindran
> <https://scholar.google.co.in/citations?user=nGUcGrYAAAAJ&hl=en> on
> multi-agent reinforcement learning and will continue to do my final degree
> thesis project under his guidance.
> I am currently a graduate research intern at Intel Labs working on
> reinforcement learning.
> Previously, I was a computer vision intern at Caterpillar Inc. As part of the
> machine learning course, a competition was organized among the students, and
> I secured first place in that competition
> <https://www.kaggle.com/c/iitm-cs4011/leaderboard>.
> I am familiar with deep learning and have completed the fast.ai
> <http://fast.ai/> MOOC along with the course offered at our institute.
> 
> I have read the papers related to the reinforcement learning algorithms
> mentioned on the ideas page, and I am interested in working on the
> reinforcement learning module.
> 
> I have compiled mlpack from source and am looking at the code structure of
> the reinforcement learning module. I am unable to find any tickets at the
> moment and am hoping someone could direct me on how to proceed.
> 
> I have been interested in using reinforcement learning for equity trading,
> and recurrent reinforcement learning algorithms in particular have caught my
> interest. I believe the stock market is a good environment (a POMDP) to test
> and evaluate the performance of such algorithms, as it is a highly
> challenging setting. There are many agents involved in the environment, and I
> feel that developing reinforcement learning algorithms that can trade
> efficiently in such a setting would be an interesting problem. Supervised
> deep learning models like LSTMs cannot capture the latency involved in the
> system and hence cannot make real-time predictions. Reinforcement learning
> algorithms, however, could learn how to act under the latency constraint and
> make real-time decisions. Some areas where I see work in this direction are:
> Implement the latest work(s) in multi-agent reinforcement learning.
> Implement recurrent reinforcement learning algorithm(s) that capture the
> temporal nature of the environment; modifications can be made to existing
> work.
> I would like to hear from the mentors what they feel about the suggested idea
> and whether it seems like an acceptable project for GSoC.
> 
> Thanks for your time.
> 
> Hope to hear from you soon. Feel free to ask for any more details about me or 
> my work.
> 
> Regards,
> 
> Rohan Saphal
> 
> 
> <rrl.pdf> <RRL .pdf> <07376685.pdf> <LvDuZhai.pdf> <SSRN-id2594477.pdf>
_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
