I recently completed the general guidelines for a future project that I would like to start developing, but I've hit a wall with respect to how to design it. In short, I want to run through approximately 5 GB of financial data, all of which is stored in a large number of text files. As far as formatting and data integrity go, I would go through and ensure that each file has the required setup, so that's not really the issue. The problem I am having is with respect to speed.
The languages I knew best coming into this project are C++ and PHP. However, I then thought about how long it would take one PC to iterate through everything and figured it would probably take a significant amount of time. As such, I started looking into various languages, and Python caught my interest the most due to its power and what seems to be ease of use. I was initially going to use Python only as a means of creating various indicators (i.e. calculations that would be performed on the data in the file). However, I am now leaning towards moving to Python entirely, mostly due to its GUI support.

First off, I was wondering if this is a reasonable setup: the entire process would involve a server which manages which PC is processing which set of data (which may be a given text file or the like), and a client application which I would run on a few PCs locally when they aren't in use. I would have a database (SQLite) holding all calculated data of significance. Each client would basically log in/connect to the server and request a time interval (i.e. does anything need to be processed? If so, what data should I look at?), and then it would update its status with the server, which would place a lock on that data set.

One thing I was wondering is whether it would be worth it to use C++ for the actual iteration through the text files, or whether I should simply use Python. While I'm sure that C++ would be faster, I am not entirely sure it's worth the headache if it's not going to save me significant processing time.

Another thing: if I were to work with Python instead of C++, would it be worth it to import all of the data into an SQLite database beforehand (for speed)?

Lastly, as far as the networking goes, I have seen posts about something called Pyro (http://pyro.sourceforge.net) and wondered if it is worth looking into for the client/server interaction.
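To make the locking step concrete, here is a rough sketch of the server-side bookkeeping I have in mind, using only Python's standard sqlite3 module. The table name (work), the status values, and the function names (open_queue, add_files, claim_next, mark_done) are all placeholders I made up for illustration, and the sketch assumes that only the server process touches this database; how the clients actually talk to the server (Pyro or otherwise) is left out.

```python
import sqlite3


def open_queue(db_path=":memory:"):
    """Open (or create) the bookkeeping database. Hypothetical schema."""
    # isolation_level=None puts the connection in autocommit mode, so
    # transactions are controlled explicitly with BEGIN/COMMIT below.
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS work ("
        " id INTEGER PRIMARY KEY,"
        " path TEXT UNIQUE NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " worker TEXT)"
    )
    return conn


def add_files(conn, paths):
    """Register data files to process; already-known paths are skipped."""
    conn.executemany(
        "INSERT OR IGNORE INTO work (path) VALUES (?)",
        [(p,) for p in paths],
    )


def claim_next(conn, worker):
    """Lock one pending file for `worker`; return its path, or None."""
    # BEGIN IMMEDIATE takes the write lock before the SELECT, so two
    # callers cannot both pick the same pending row.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, path FROM work"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE work SET status = 'locked', worker = ? WHERE id = ?",
                (worker, row[0]),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row[1] if row is not None else None


def mark_done(conn, path):
    """Record that a client finished processing `path`."""
    conn.execute("UPDATE work SET status = 'done' WHERE path = ?", (path,))
```

The idea would be that a client asks the server for work, the server calls claim_next with that client's name and sends back the path (or tells it there is nothing left), and the client reports back so the server can call mark_done. A real version would presumably also need a way to release locks held by a client that disappears mid-job.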
I apologize if any of these questions are on the basic side; this is simply the first client/server application I've created, and I am doing so in a language I've never used before ;)

Thanks for the help
-Tony
--
http://mail.python.org/mailman/listinfo/python-list