On a single node, you can easily achieve tens of thousands of key-value
inserts per second. Depending on how many columns are in each row, 600
rows a second is rather slow :)
Your loop looks good. Using a single BatchWriter and letting it amortize
sending data from your client to the servers will be the most efficient.
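For concreteness, a minimal sketch of that pattern (the instance, table,
user, and tuning numbers below are placeholders, not recommendations):

import java.util.concurrent.TimeUnit;

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.hadoop.io.Text;

public class SingleWriterIngest {
  public static void main(String[] args) throws Exception {
    Connector conn = new ZooKeeperInstance("myInstance", "zkhost:2181")
        .getConnector("ingestUser", new PasswordToken("secret"));

    // One BatchWriter for the whole ingest; it buffers mutations and sends
    // them to the tablet servers in background threads.
    BatchWriterConfig cfg = new BatchWriterConfig();
    cfg.setMaxMemory(64 * 1024 * 1024);     // buffer up to 64MB of mutations
    cfg.setMaxLatency(2, TimeUnit.MINUTES); // flush at least every 2 minutes
    cfg.setMaxWriteThreads(4);              // parallel sends to tablet servers

    BatchWriter writer = conn.createBatchWriter("myTable", cfg);
    try {
      // One mutation per JSON record; row = timestamp, columns = fields.
      Mutation m = new Mutation(new Text("20150324T120000"));
      m.put(new Text("data"), new Text("someField"), new Value("someValue".getBytes()));
      writer.addMutation(m);
    } finally {
      writer.close();  // close() flushes anything still buffered
    }
  }
}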
If the JSON parsing is the slowest part, you could have a single thread
read the file and hand each line to a thread pool; the workers parse the
lines and add the parsed objects to some concurrent data structure. A
consumer on that data structure then reads each parsed object and sends
it to Accumulo.
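A rough sketch of that pipeline, assuming a bounded queue between the
parsing pool and a single consumer that owns the BatchWriter (the thread
counts, queue size, file name, and parseToMutation are all illustrative):

import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.data.Mutation;
import org.apache.hadoop.io.Text;

public class ParallelParseIngest {

  // Sentinel used to tell the consumer that parsing is finished.
  private static final Mutation POISON = new Mutation(new Text("stop"));

  public static void ingest(BatchWriter writer) throws Exception {
    ExecutorService parsers = Executors.newFixedThreadPool(4);
    BlockingQueue<Mutation> queue = new ArrayBlockingQueue<>(10_000);

    // Consumer: the only thread that touches the BatchWriter.
    Thread consumer = new Thread(() -> {
      try {
        while (true) {
          Mutation m = queue.take();
          if (m == POISON) break;
          writer.addMutation(m);
        }
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    });
    consumer.start();

    // Producer: one thread reads the file and hands lines to the parsing pool.
    try (BufferedReader reader = Files.newBufferedReader(Paths.get("records.json"))) {
      String line;
      while ((line = reader.readLine()) != null) {
        final String json = line;
        parsers.submit(() -> {
          queue.put(parseToMutation(json)); // workers parse and enqueue
          return null;
        });
      }
    }

    parsers.shutdown();
    parsers.awaitTermination(1, TimeUnit.HOURS);
    queue.put(POISON);
    consumer.join();
    writer.flush();
  }

  static Mutation parseToMutation(String json) {
    // Parse the JSON line and build the Mutation; details omitted.
    return new Mutation(new Text("row"));
  }
}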
Alternatively, this is where MapReduce is a clear win, as it's very good
at parallelizing these types of problems. You could use a FileInputFormat
(such as TextInputFormat) with the AccumuloOutputFormat to accomplish this task.
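For that route, a map-only driver would look something like the sketch
below; the instance, table, and credential values are placeholders, and
the mapper's JSON parsing is elided:

import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class JsonIngestJob {

  // Each mapper parses JSON lines and emits (table name, Mutation) pairs.
  public static class JsonMapper extends Mapper<LongWritable, Text, Text, Mutation> {
    private static final Text TABLE = new Text("myTable");

    @Override
    protected void map(LongWritable key, Text line, Context context)
        throws java.io.IOException, InterruptedException {
      // Parse 'line' here; the row/column values below are placeholders.
      Mutation m = new Mutation(new Text("20150324T120000"));
      m.put(new Text("data"), new Text("someField"), new Value("someValue".getBytes()));
      context.write(TABLE, m);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "json-ingest");
    job.setJarByClass(JsonIngestJob.class);

    job.setInputFormatClass(TextInputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));

    job.setMapperClass(JsonMapper.class);
    job.setNumReduceTasks(0);               // map-only ingest
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Mutation.class);

    job.setOutputFormatClass(AccumuloOutputFormat.class);
    AccumuloOutputFormat.setConnectorInfo(job, "ingestUser", new PasswordToken("secret"));
    AccumuloOutputFormat.setZooKeeperInstance(job,
        ClientConfiguration.loadDefault().withInstance("myInstance").withZkHosts("zkhost:2181"));
    AccumuloOutputFormat.setDefaultTableName(job, "myTable");
    AccumuloOutputFormat.setCreateTables(job, true);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}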
Andrea Leoni wrote:
Thank you for your answer.
Today I tried to create a big command file and push it to the shell (about 300k
inserts per file). As you said, it is too slow for me (about 600 inserted
rows/sec).
I've only been using Accumulo for a week. I'm a noob, but I'm learning.
My app has to store a large amount of data.
The row is the timestamp and the family/qualifier are the columns... I get my
data from a JSON file, so my app scans it for new records, parses each one, and
for each record creates a mutation and pushes it to Accumulo with a BatchWriter...
Maybe I'm doing something wrong, and fixing it could increase the speed of my inserts.
Currently I do:
LOOP
1) read a JSON line
2) parse it
3) create a mutation
4) put the line's information into the mutation
5) use the BatchWriter to insert the mutation into Accumulo
END LOOP
Is this all right? I know that steps 1) and 2) are slow, but they're necessary,
and I use the fastest JSON parser I've found online.
Thank you so much again!
(and sorry again for my bad English!)
-----
Andrea Leoni
Italy
Computer Engineering