Dear team,

I am trying to load a dataset of 25 million records (1.3 GB) of sample call 
data into Riak. The target is a 2-node Riak cluster with 4 quad-core 
processors and 1.5 TB of storage. The load takes real 5671m12.812s, which is 
far too long. Please suggest how this can be improved.

We deal with big data, and I need to store and test 165 GB on Riak. At the 
present rate I guess that load may take years. I have already loaded the 
165 GB into MongoDB and got the results, and I need the Riak figures for a 
comparative performance study of MongoDB and Riak. Please assist me with 
this.



I am using the following code for loading:

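#!/usr/local/bin/escript
%% Usage: ./load_data25m CustCalls25m.csv
main([Filename]) ->
    {ok, Data} = file:read_file(Filename),
    %% Split the file into lines; tl/1 drops the first line.
    Lines = tl(re:split(Data, "\r?\n", [{return, binary}, trim])),
    %% Split each line on commas and insert it as one record.
    lists:foreach(fun(L) -> LS = re:split(L, ","), format_and_insert(LS) end,
                  Lines).

%% Line is a list of six fields: id, phonenumber, callednumber,
%% starttime, endtime, status.
format_and_insert(Line) ->
    JSON = io_lib:format(
        "{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,"
        "\"starttime\":~s,\"endtime\":~s,\"status\":~s}",
        Line),
    %% One curl process is started per record; the id is the Riak key.
    Command = io_lib:format(
        "curl -X PUT http://10.232.5.169:8098/riak/CustCalls25m/~s "
        "-d '~s' -H 'content-type: application/json'",
        [hd(Line), JSON]),
    io:format("Inserting: ~s~n", [hd(Line)]),
    os:cmd(Command).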

[hadoop@CTSINGMRGTO data]$ time ./load_data25m CustCalls25m.csv >> 25m.txt &
[3] 32354


[hadoop@CTSINGMRGTO data]$
real    5671m12.812s
user    1725m31.862s
sys     3074m42.135s
[hadoop@CTSINGMRGTO data]$

[hadoop@CTSINGMRGTO data]$ tail -4 25m.txt
Inserting: 24999997
Inserting: 24999998
Inserting: 24999999
Inserting: 25000000
[hadoop@CTSINGMRGTO data]$
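
To put the timings above in perspective, here is a rough sketch of the arithmetic behind the "may take years" guess. It uses only the figures already quoted (25 million records, real 5671m12.812s, 1.3 GB versus 165 GB) and assumes that load time scales linearly with data size, which is an assumption and not something I have measured. The variable names are only illustrative.

```erlang
#!/usr/bin/env escript
%% Rough throughput estimate from the `time` output above.
%% Assumption: load time grows linearly with the size of the data.
main(_) ->
    Records  = 25000000,
    RealSecs = 5671 * 60 + 12.812,     %% real 5671m12.812s, in seconds
    Rate     = Records / RealSecs,     %% records inserted per second
    Scale    = 165 / 1.3,              %% 165 GB target vs. 1.3 GB sample
    Days     = RealSecs * Scale / 86400,
    io:format("~.1f records/s, about ~B days for 165 GB~n",
              [Rate, round(Days)]).
```

Under that assumption this works out to roughly 73 records per second, and on the order of 500 days for the 165 GB load.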
