Yes, a BatchWriter is for one table only. If you're writing to multiple tables, the MultiTableBatchWriter might be helpful. A MultiTableBatchWriter does the same thing as managing multiple BatchWriters yourself, but the writers it hands out share a single memory buffer.
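
For your three-table case it would look roughly like this -- just a sketch, where the instance name, ZooKeeper hosts, credentials, and table names are placeholders for your setup:

    import org.apache.accumulo.core.client.*;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;

    public class ThreeTableIngest {
      public static void main(String[] args) throws Exception {
        // Placeholders: instance name, ZooKeeper hosts, credentials, table names.
        ZooKeeperInstance instance = new ZooKeeperInstance("myInstance", "zk1:2181");
        Connector conn = instance.getConnector("user", new PasswordToken("password"));

        BatchWriterConfig config = new BatchWriterConfig();
        config.setMaxMemory(50 * 1024 * 1024); // one 50MB buffer shared by all writers

        MultiTableBatchWriter mtbw = conn.createMultiTableBatchWriter(config);
        try {
          // One BatchWriter per destination table, all backed by mtbw's buffer.
          BatchWriter alerts = mtbw.getBatchWriter("alert");
          BatchWriter dns = mtbw.getBatchWriter("dns");
          BatchWriter http = mtbw.getBatchWriter("http");

          // Route each record to the writer for its type, e.g.:
          Mutation m = new Mutation("alert-row-1");
          m.put("raw", "json", new Value("{...}".getBytes()));
          alerts.addMutation(m);
        } finally {
          mtbw.close(); // flushes and closes the writers for all three tables
        }
      }
    }

A single flush() or close() on the MultiTableBatchWriter applies to all the underlying writers at once, which is the main convenience over juggling three separate BatchWriters.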

Are you familiar with Hadoop's MapReduce framework?

http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html

MapReduce jobs accept data from InputFormats and write data to OutputFormats. Specifically, FileInputFormat lets your MapReduce jobs read data from HDFS, and AccumuloOutputFormat writes Mutations to Accumulo tables. Unless you have many nodes with lots and lots of data constantly flowing in, MapReduce might be overkill. I just thought I'd mention it, though.

http://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/client/mapreduce/AccumuloOutputFormat.html
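
If you do go that route, a minimal map-only job could look something like the sketch below (assuming the 1.6 mapreduce API). The substring check standing in for real JSON parsing, the class name, and the connection details are all illustrative; one handy detail is that AccumuloOutputFormat takes the destination table name as the output key, so a single job can feed all three of your tables:

    import java.io.IOException;

    import org.apache.accumulo.core.client.ClientConfiguration;
    import org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    public class JsonIngestJob {

      // Map-only: each input line is one JSON record. The output key is the
      // destination table name; the output value is the Mutation to write.
      public static class IngestMapper
          extends Mapper<LongWritable, Text, Text, Mutation> {
        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
            throws IOException, InterruptedException {
          String json = line.toString();
          // Naive type check for illustration only -- use a real JSON parser here.
          String table = json.contains("\"dns\"") ? "dns"
                       : json.contains("\"http\"") ? "http"
                       : "alert";
          Mutation m = new Mutation("row-" + offset.get());
          m.put("raw", "json", new Value(json.getBytes()));
          ctx.write(new Text(table), m);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance();
        job.setJobName("json-ingest");
        job.setJarByClass(JsonIngestJob.class);

        // Read lines of JSON from HDFS (TextInputFormat is a FileInputFormat).
        job.setInputFormatClass(TextInputFormat.class);
        TextInputFormat.addInputPath(job, new Path(args[0]));

        job.setMapperClass(IngestMapper.class);
        job.setNumReduceTasks(0); // map-only
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Mutation.class);

        // Write Mutations straight into Accumulo. Instance name, ZooKeeper
        // hosts, and credentials below are placeholders.
        job.setOutputFormatClass(AccumuloOutputFormat.class);
        AccumuloOutputFormat.setConnectorInfo(job, "user", new PasswordToken("password"));
        AccumuloOutputFormat.setZooKeeperInstance(job, ClientConfiguration.loadDefault()
            .withInstance("myInstance").withZkHosts("zk1:2181"));
        AccumuloOutputFormat.setCreateTables(job, true); // create alert/dns/http if missing

        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }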

Keep in touch -- wouldn't want to keep you from being able to graduate :)

Revan1988 wrote:
Each BatchWriter is for only one table, isn't it?
I need to split my JSON records across 3 tables (my records come from an IDS, so
I have to divide them into ALERT, DNS, and HTTP record types).
So maybe I can use 3 BatchWriters... I'll try!!

And what about FileInputFormat and AccumuloOutputFormat? I'm sorry, but I
don't know them very well... do you have any website, PDF, or sample that I can
study?

Thank you again!
I want to do good work because this is the project for my MSc
graduation... but here at my university no one knows much about Accumulo.



-----
Andrea Leoni
Italy
Computer Engineering