Dear team,
I am trying to load a 25-million-record dataset (1.3 GB) of sample call data
into Riak. The setup is a 2-node Riak cluster (4 quad-core, 1.5 TB storage),
and the load takes 5671m12.812s of real time, which is about 94.5 hours.
Please suggest how I can speed this up. We deal with big data, and I need to
store and test 165 GB on Riak; at the present rate that load could take well
over a year. I have already loaded the 165 GB into MongoDB and got results
there, and I need the Riak numbers for a comparative performance study of
MongoDB and Riak. Please assist me with this.
I am using the following escript to load the data:
#!/usr/local/bin/escript
%% Reads a CSV file and PUTs each record to Riak as a JSON object by
%% shelling out to curl (one curl process per record).
main([Filename]) ->
    {ok, Data} = file:read_file(Filename),
    %% tl/1 skips the first line (presumably the CSV header).
    Lines = tl(re:split(Data, "\r?\n", [{return, binary}, trim])),
    lists:foreach(fun(L) ->
                          LS = re:split(L, ","),
                          format_and_insert(LS)
                  end, Lines).

%% Line is the list of six CSV fields; the first field is used as the key.
format_and_insert(Line) ->
    JSON = io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,"
                         "\"starttime\":~s,\"endtime\":~s,\"status\":~s}",
                         Line),
    Command = io_lib:format("curl -X PUT http://10.232.5.169:8098/riak/CustCalls25m/~s"
                            " -d '~s' -H 'content-type: application/json'",
                            [hd(Line), JSON]),
    io:format("Inserting: ~s~n", [hd(Line)]),
    os:cmd(Command).
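For comparison, here is a minimal sketch of the same load with each PUT issued in-process through the httpc client that ships with Erlang/OTP (inets), instead of starting a curl process per record. The URL, bucket and six-field JSON template are taken from the script above; the helper names (base_url, record_to_json, record_url, put_record) are hypothetical. Whether this removes the bottleneck is an assumption that would have to be measured.

```erlang
#!/usr/bin/env escript
%% Sketch only: same CSV-to-Riak load, but using httpc instead of curl.
main([Filename]) ->
    ok = inets:start(),
    {ok, Data} = file:read_file(Filename),
    %% tl/1 skips the first line, as in the original script.
    Lines = tl(re:split(Data, "\r?\n", [{return, binary}, trim])),
    lists:foreach(fun(L) -> put_record(re:split(L, ",")) end, Lines).

%% Bucket URL copied from the original curl command.
base_url() ->
    "http://10.232.5.169:8098/riak/CustCalls25m/".

%% Build the JSON body for one record (same template as the original).
record_to_json(Fields) ->
    iolist_to_binary(
      io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,"
                    "\"starttime\":~s,\"endtime\":~s,\"status\":~s}", Fields)).

%% Build the object URL from the record id (first CSV field).
record_url([Id | _]) ->
    base_url() ++ binary_to_list(Id).

%% PUT one record; crash on anything other than a 2xx response.
put_record(Fields) ->
    Request = {record_url(Fields), [], "application/json", record_to_json(Fields)},
    {ok, {{_, Status, _}, _, _}} = httpc:request(put, Request, [], []),
    true = Status >= 200 andalso Status < 300,
    ok.
```

It would be run the same way as the original script, with the CSV file as the only argument. The sketch still sends one request at a time, so it only tests whether the per-record process start-up is the dominant cost.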
[hadoop@CTSINGMRGTO data]$ time ./load_data25m CustCalls25m.csv >> 25m.txt &
[3] 32354
[hadoop@CTSINGMRGTO data]$
real 5671m12.812s
user 1725m31.862s
sys 3074m42.135s
[hadoop@CTSINGMRGTO data]$
[hadoop@CTSINGMRGTO data]$ tail -4 25m.txt
Inserting: 24999997
Inserting: 24999998
Inserting: 24999999
Inserting: 25000000
[hadoop@CTSINGMRGTO data]$
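As a rough check on the numbers above, the following sketch turns the reported real time into a throughput figure and scales it to the 165 GB target. The record count, elapsed time and sizes come from this message; the assumption that load time grows linearly with data size is mine, and the function names are hypothetical.

```erlang
#!/usr/bin/env escript
%% Rough arithmetic on the timing above. Linear scaling to 165 GB is an
%% assumption, not a measurement.
main(_) ->
    Records = 25000000,
    Seconds = elapsed_seconds(5671, 12.812),   % "real 5671m12.812s"
    io:format("~.1f records/s~n", [rate(Records, Seconds)]),
    io:format("~.1f days for 165 GB~n", [scaled_days(Seconds, 1.3, 165)]).

%% Convert the minutes/seconds pair printed by `time` into seconds.
elapsed_seconds(Minutes, Secs) -> Minutes * 60 + Secs.

%% Average number of records inserted per second.
rate(Records, Seconds) -> Records / Seconds.

%% Days needed if load time grows linearly with data size.
scaled_days(Seconds, FromGB, ToGB) -> Seconds * (ToGB / FromGB) / 86400.
```

This works out to roughly 73 records per second, and about 500 days for 165 GB under the linear assumption.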