Re: Best practice for in memory data?

2007-01-25 Thread Doug Cutting
Johan Oskarsson wrote:
> Any advice on how to solve this problem?

I think your current solutions sound reasonable.

> Would it be possible to somehow share a hashmap between tasks?

Not without running multiple tasks in the same JVM. We could implement a mode where child tasks are run directly

Re: Best practice for in memory data?

2007-01-25 Thread Bryan A. P. Pendleton
There's also code floating around for a Multithreaded MapRunner. This (with appropriate synchronization) would allow a shared HashMap without having to pay the per-simultaneous-map overhead. Another thing that might or might not make sense would be to use memcached for your hashtable. This may
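
The shared-map idea above can be illustrated with a minimal sketch in plain Java. This is not the Multithreaded MapRunner code mentioned in the message, and it uses no Hadoop API. The class name SharedLookupDemo and the method countKnown are hypothetical, chosen only for illustration. It assumes the lookup table is loaded once and only read afterwards, and it uses a ConcurrentHashMap as one possible form of "appropriate synchronization".

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: several worker threads in one JVM read a single
// in-memory lookup table instead of each holding its own copy.
class SharedLookupDemo {

    // Counts how many records are present as keys in the shared lookup map.
    // Every worker thread reads the same map instance.
    static int countKnown(Map<String, Boolean> lookup, List<String> records, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger known = new AtomicInteger();
        for (String record : records) {
            pool.execute(() -> {
                if (lookup.containsKey(record)) {
                    known.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            // Wait for all submitted lookups to finish before reading the counter.
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return known.get();
    }

    public static void main(String[] args) {
        // The table is built once and shared by all threads.
        Map<String, Boolean> lookup = new ConcurrentHashMap<>();
        lookup.put("alpha", true);
        lookup.put("beta", true);

        List<String> records = List.of("alpha", "gamma", "beta", "alpha");
        System.out.println(countKnown(lookup, records, 4)); // prints 3
    }
}
```

The memory saving comes from holding one map per JVM rather than one per task. Whether this helps in practice depends on how many tasks can actually be run as threads inside the same JVM.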

Best practice for in memory data?

2007-01-24 Thread Johan Oskarsson
Hi. Some of my MapReduce jobs need quick access to additional data to check some input values in the map phase. This data is currently held in memory in a HashMap. It's very quick, but as each job starts several JVMs the data will be held in memory multiple times. It will also