On 06/02/13 22:21, bioinfornatics wrote:
Thanks monarch and FG,
i will read your code to see where i failing :-)

I wasn't going to mention this as I thought the CPU usage might be trivial, but if both CPU and IO are factors, then it would probably be beneficial to have a separate IO thread/task.

I guess you'd need a fairly coarse-grained task: it would need to load and return n chunks or n lines, rather than just one line at a time, for example, so that the processing/parsing thread (main thread or otherwise) could churn through that batch while more IO was done.
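
A rough sketch of that reader/parser split, in Python only to keep it short (the same shape should carry over to other languages). The names read_batches, process_stream, batch_size and max_pending are made up for this example, and parse_line stands in for whatever per-line parsing you actually do:

```python
import io
import queue
import threading
from itertools import islice


def read_batches(stream, batch_size, out_queue):
    """IO task: read up to batch_size lines at a time and queue each batch."""
    while True:
        batch = list(islice(stream, batch_size))
        if not batch:
            break
        out_queue.put(batch)
    out_queue.put(None)  # sentinel: nothing left to read


def process_stream(stream, parse_line, batch_size=1000, max_pending=4):
    """Parse lines in the calling thread while a second thread keeps reading."""
    # The bounded queue stops the reader from running far ahead of the parser.
    pending = queue.Queue(maxsize=max_pending)
    reader = threading.Thread(
        target=read_batches, args=(stream, batch_size, pending), daemon=True
    )
    reader.start()

    results = []
    while True:
        batch = pending.get()
        if batch is None:
            break
        results.extend(parse_line(line) for line in batch)
    reader.join()
    return results


if __name__ == "__main__":
    sample = io.StringIO("1\n2\n3\n4\n5\n")
    print(process_stream(sample, lambda s: int(s) * 2, batch_size=2))
```

The batch size is the knob to play with: too small and the queue traffic costs more than it saves, too large and the parser sits idle waiting for the first batch.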

It would also depend on the size of the file: no point firing up a thread just to read a tiny file that the filesystem can return in a millisecond. If you're talking about 1+ minutes of loading though, a thread should definitely help.

Also, if you don't strictly need to parse the file in order, then you could divide and conquer it by breaking it into more sections/tasks. For example, if you're parsing records, you could split the file in half, find the tail of the record that straddles the split at the start of the second half, move it to the end of the first half, and then process the two halves in two threads. If you've a nice function to do that split cleanly, and n CPUs, then just call it some more.
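
Something like the following is what I have in mind for the split, again sketched in Python. It assumes newline-terminated records already loaded into a bytes object, and split_on_records, parse_section and parse_parallel are names invented for the example. Instead of physically moving the straddling record, it pushes each cut point forward to just past the next delimiter, which amounts to the same thing:

```python
from concurrent.futures import ThreadPoolExecutor


def split_on_records(data, parts, delimiter=b"\n"):
    """Return (start, end) ranges covering data, each cut just after a delimiter."""
    bounds = [0]
    for i in range(1, parts):
        # Naive cut point, never earlier than the previous cut.
        guess = max(len(data) * i // parts, bounds[-1])
        # Give the rest of the straddling record to the earlier section.
        hit = data.find(delimiter, guess)
        bounds.append(len(data) if hit == -1 else hit + len(delimiter))
    bounds.append(len(data))
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if a < b]


def parse_section(data, start, end, delimiter=b"\n"):
    """Placeholder parser: treats every record in the section as an integer."""
    return [int(rec) for rec in data[start:end].split(delimiter) if rec]


def parse_parallel(data, parts=2):
    """Split at record boundaries, then parse each section in its own task."""
    ranges = split_on_records(data, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        sections = pool.map(lambda r: parse_section(data, *r), ranges)
        # map() yields sections in order, so the result order matches the file.
        return [value for section in sections for value in section]


if __name__ == "__main__":
    print(parse_parallel(b"10\n200\n3000\n40\n5\n", parts=2))
```

One caveat about the Python version specifically: threads in CPython won't speed up CPU-bound parsing because of the global interpreter lock, so there you would swap in ProcessPoolExecutor (with a module-level function rather than a lambda). The splitting logic is the part that matters.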



--
Lee
