Hi.

Sorry if this is a bit long, but perhaps it may be interesting to one or two.

On Monday, 22 December 2014 at 22:00:36 UTC, Daniel Davidson wrote:
> On Monday, 22 December 2014 at 19:25:51 UTC, aldanor wrote:
>> On Monday, 22 December 2014 at 17:28:39 UTC, Daniel Davidson wrote:
>>> I don't see D attempting to tackle that at this point.
>>> If the bulk of the work for the "data sciences" piece is the maths, which I believe it is, then the attraction of D as a "data sciences" platform is muted. If the bulk of the work is preprocessing data to get to an all numbers world, then in that space D might shine.
>> That is one of my points exactly -- the "bulk of the work", as you put it, is quite often the data processing/preprocessing pipeline (all the way from raw data parsing, aggregation, validation and storage to data retrieval, feature extraction, and then serialization, various persistency models, etc).
>
> I don't know about low frequency which is why I asked about Winton. Some of this is true in HFT but it is tough to break that pipeline that exists in C++. Take live trading vs backtesting: you require all that data processing before getting to the math of it to be as low latency as possible for live trading which is why you use C++ in the first place. To break into that pipeline with another language like D to add value, say for backtesting, is risky not just because the duplication of development cost but also the risk of live not matching backtesting.
>
> Maybe you have some ideas in mind where D would help that data processing pipeline, so some specifics might help?

I have been working as a PM for quantish buy side places since 98, after starting in a quant trading role on sell side in 96, with my first research summer job in 93. Over time I have become less quant and more discretionary, so I am less in touch with the techniques the cool kids are using when it doesn't relate to what I do. But more generally there is a kind of silo mentality where in a big firm people in different groups don't know much about what the guy sitting at the next bank of desks might be doing, and even within groups the free flow of ideas might be a lot less than you might think. Against that, firms with a pure research orientation may be a touch different, which just goes back again to say that from the outside it may be difficult to make useful generalisations.

A friend of mine who wrote certain parts of the networking stack in Linux is interviewing with HFT firms now, so I may soon have a better idea about whether D might be of interest to them. He has heard of D but suggests Java instead (as a general option, not for HFT). Even smart people can fail to appreciate beauty ;)

I think it's public that GS use a Python-like language internally, JPM do use Python for what you would expect, and so do AHL (one of the largest lower freq quant firms). More generally, in every field, but especially in finance, it seems like the data processing aspect is going to be key - not just a necessary evil. Yes, once you have it up and running you can tick it off, but it is going to be some years before you start to tick off items faster than they appear. Look at what Bridgewater are doing with gauging real time economic activity (and look at Google Flu prediction if one starts to get too giddy - it worked and then didn't).

There is a spectrum of different qualities of data. What is most objective is not necessarily what is most interesting. Yet work on affect, media, and sentiment analysis is in its very early stages. One can do much better than just "affect bad, buy stocks once they stop going down"... Someone who asked me to help with something is close to Twitter, and I have heard the number of firms taking their full feed and the rough breakdown by sector. It is shockingly small in the financial services field, and that's probably in part just that it takes people time to figure out something new.

Ravenpack do interesting work from the point of view of a practitioner, and I heard a talk by their former technical architect, and he really seemed to know his stuff. Not sure what they use as a platform.

I can't see why the choice of language will affect your back testing results (except that it is painful to write good algorithms in a clunky language and the risk of bugs is higher - but that isn't what you meant).

Anyway, back to D and finance. I think this mental image people have of back testing as being the originating driver of research may be mistaken. It's funny but sometimes it seems the moment you take a scientist out of his lab and put him on a trading floor he wants to know if such and such beats transaction costs. But what you are trying to do is understand certain dynamics, and one needs to understand that markets are nonlinear and have highly unstable parameters. So one must be careful about just jumping to a back test. (And then of course, questions of risk management and transaction costs really matter also).

To a certain extent one must recognise that the asset management business has a funny nature. (This does not apply to many HFT firms that manage partners' money). It doesn't take an army to make a lot of money with good people because of the intrinsic intellectual leverage of the business. But to do that one needs capital, and investors expect to see something tangible for the fees if you are managing size. Warren Buffett gets away with having a tiny organisation because he is Buffett, but that may be harder for a quant firm. So since intelligent enough people are cheap, and investors want you to hire people, it can be tempting to hire that army after all and set them to work on projects that certainly cover their costs but really may not be big determinants of variations in investment outcomes. I.e. one shouldn't mistake the number of projects for what is truly important.

I agree that it is setting up and keeping everything in production running smoothly that creates a challenge. So it's not just a question of doing a few studies in R. And the more ways of looking at the world, the harder you have to think about how to combine them. Spreadsheets don't cut the mustard anymore - they haven't for years, yet it emerged even recently with the JPM whale that lack of integrity in the spreadsheet worsened communication problems between departments (risk especially). Maybe PyPy and NumPy will pick up all of the slack, but I am not so sure.

In spreadsheet world (where one is a user, not a pro), one never finishes and says finally I am done building sheets. One question leads to another in the face of an unfolding and generative reality. It's the same with quant tools for trading. Perhaps that means there is value in tooling suited to rapid iteration and to building robust code that won't later need to be totally rewritten from scratch.

At one very big US hedge fund I worked with, the tools were initially written in Perl (some years back). They weren't pretty, but they worked, and were fast and robust enough. I had many new features I needed for my trading strategy. But the owner - who liked to read about ideas on the internet - came to the conclusion that Perl was not institutional quality and that we should therefore cease new development and rewrite everything in C++. Two years later a new guy took over the larger group, and one way or the other everyone left. I never got my new tools, and that certainly didn't help on the investment front. After he left, a year after that, they scrapped the entire code base and bought Murex as nobody could understand what they had.

If we had had D then, it's possible the outcome might have been different.

So in any case, it is hard to generalise, and better to pick a few sympathetic people that see in D a possible solution to their pain; usage patterns will emerge organically out of that. I am happy to help where I can, and that is somewhat my own perspective - maybe D can help me solve my pain of tools not being up to scratch, because good investment tool design requires investment and technology skills to be combined in one person, whereas each of these two is rarely found even on its own. (D makes a vast project closer to brave than foolhardy).

It would certainly be nice to have matrices, but I also don't think it would be right to say D is dead in the water here because it is so far behind. It also seems like the cost of writing such a library is very small vs the possible benefit.

One final thought. It's very hard to hire good young people. We had 1500 CVs for one job with very impressive backgrounds - French grandes écoles, and the like. But ask a chap how he would sort a list of books without a library, and the results were shocking. It seems like looking amongst D programmers is a nice heuristic, although perhaps the pool is too small for now. I am not hiring now, but was thinking about it for the future.
