Re: Storing large files for later processing through hadoop

2015-01-03 Thread Wilm Schumacher
On 03.01.2015 at 07:07, Srinivasa T N wrote: Hi Wilm, The reason is that for some auditing purpose, I want to store the original files also. Well, then I would use an HDFS cluster for storing, as it seems to be exactly what you need. If you collocate HDFS DataNodes and YARN's ResourceManager, …

Re: Storing large files for later processing through hadoop

2015-01-02 Thread Wilm Schumacher
Hi, perhaps I have totally misunderstood your problem, but why bother with Cassandra for storage in the first place? If your MR job for Hadoop is only run once per file (as you wrote above), why not copy the data directly to HDFS, run your MR job, and use Cassandra as the sink? As HDFS and YARN are more …
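The workflow suggested above (copy the files straight into HDFS, run the MapReduce job once, write results to Cassandra) might look roughly like the following sketch. All paths, the jar name, and the job class are illustrative assumptions, not details taken from the thread:

```shell
# Copy the raw input files straight into HDFS so the originals are
# retained for auditing (directory names are hypothetical).
hdfs dfs -mkdir -p /data/incoming
hdfs dfs -put local_files/* /data/incoming/

# Run the MapReduce job once per batch; "my-job.jar" and
# "com.example.MyJob" are placeholders for the actual job.
hadoop jar my-job.jar com.example.MyJob /data/incoming /data/processed

# Inside the job, output would go to Cassandra (for example via the
# CqlOutputFormat from Cassandra's Hadoop integration), so Cassandra
# serves as the sink rather than the primary file store.
```

This keeps HDFS as the system of record for the raw files while Cassandra only holds the processed results, which is the division of labor the reply argues for.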

Re: 2014 nosql benchmark

2014-12-18 Thread Wilm Schumacher
Hi, I'm always interested in such benchmark experiments, because the databases evolve so fast that the race is always open and there is a lot of motion in the field. And of course I asked myself the same question. And I think that this publication is unreliable, for 4 reasons (from reading very fast, …