Great, thanks Russell. If you need me to do anything, feel free to ask.
Cheers
Simon

On Wed, Jul 30, 2014 at 10:19:56AM +0100, Russell Brown wrote:
> Thanks Simon,
>
> I’m going to spend some time on this today.
>
> Cheers
>
> Russell
>
> On 30 Jul 2014, at 10:05, Effenberg, Simon <[email protected]> wrote:
>
> > Hi Russell,
> >
> > one machine out of 13 is still on wheezy and the rest are on squeeze, but
> > the software is the same and Basho even provides the Erlang stuff. So
> > there should be no real difference inside the application.
> >
> > And the errors are almost the same (except for the async_write/async_get
> > difference).
> >
> > I'll paste them:
> >
> > ---------- node 1 -----------
> >
> > 2014-07-30 06:16:07.728 UTC [info]
> > <0.14871.336>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> > Total partitions: 1
> > Finished partitions: 1
> > Speed: 100
> > Total 2i items scanned: 0
> > Total tree objects: 0
> > Total objects fixed: 0
> > With errors:
> > Partition: 125597796958124469533129165311555572001681702912
> > Error: index_scan_timeout
> >
> >
> > 2014-07-30 06:16:07.728 UTC [error] <0.1525.0> gen_server <0.1525.0>
> > terminated with reason: bad argument in call to
> > eleveldb:async_write(#Ref<0.0.324.211123>, <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3
> > line 155
> > 2014-07-30 06:16:07.728 UTC [error] <0.1525.0> CRASH REPORT Process
> > <0.1525.0> with 0 neighbours exited with reason: bad argument in call to
> > eleveldb:async_write(#Ref<0.0.324.211123>, <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], [])
> > in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
> > 2014-07-30 06:16:07.728 UTC [error] <0.1517.0> Supervisor
> > {<0.1517.0>,poolboy_sup} had child riak_core_vnode_worker started with
> > {riak_core_vnode_worker,start_link,undefined} at <0.1525.0> exit with
> > reason bad argument in call
> > to eleveldb:async_write(#Ref<0.0.324.211123>, <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> > []) in eleveldb:write/3 line 155 in context child_terminated
> >
> >
> > ---------- node 2 -----------
> >
> > 2014-07-30 06:16:07.791 UTC [info]
> > <0.8083.314>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> > Total partitions: 1
> > Finished partitions: 1
> > Speed: 100
> > Total 2i items scanned: 0
> > Total tree objects: 0
> > Total objects fixed: 0
> > With errors:
> > Partition: 622279994019798508141412682679979879462877528064
> > Error: index_scan_timeout
> >
> >
> > 2014-07-30 06:16:07.791 UTC [error] <0.1884.0> gen_server <0.1884.0>
> > terminated with reason: bad argument in call to
> > eleveldb:async_write(#Ref<0.0.318.96628>, <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line
> > 155
> > 2014-07-30 06:16:07.791 UTC [error] <0.1884.0> CRASH REPORT Process
> > <0.1884.0> with 0 neighbours exited with reason: bad argument in call to
> > eleveldb:async_write(#Ref<0.0.318.96628>, <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], [])
> > in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
> > 2014-07-30 06:16:07.792 UTC [error] <0.1875.0> Supervisor
> > {<0.1875.0>,poolboy_sup} had child riak_core_vnode_worker started with
> > {riak_core_vnode_worker,start_link,undefined} at <0.1884.0> exit with
> > reason bad argument in call
> > to eleveldb:async_write(#Ref<0.0.318.96628>, <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> > []) in eleveldb:write/3 line 155 in context child_terminated
> >
> > ---------- node 3 -----------
> >
> > 2014-07-30 06:17:42.679 UTC [info]
> > <0.15746.299>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> > Total partitions: 1
> > Finished partitions: 1
> > Speed: 100
> > Total 2i items scanned: 0
> > Total tree objects: 0
> > Total objects fixed: 0
> > With errors:
> > Partition: 291158529312015815735890337767697007822080311296
> > Error: index_scan_timeout
> >
> >
> > 2014-07-30 06:17:42.679 UTC [error] <0.975.0> gen_server <0.975.0>
> > terminated with reason: bad argument in call to
> > eleveldb:async_write(#Ref<0.0.2075.159423>, <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> > []) in eleveldb:write/3 line 155
> > 2014-07-30 06:17:42.679 UTC [error] <0.975.0> CRASH REPORT Process
> > <0.975.0> with 0 neighbours exited with reason: bad argument in call to
> > eleveldb:async_write(#Ref<0.0.2075.159423>, <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> > []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
> > 2014-07-30 06:17:42.679 UTC [error] <0.969.0> Supervisor
> > {<0.969.0>,poolboy_sup} had child riak_core_vnode_worker started with
> > {riak_core_vnode_worker,start_link,undefined} at <0.975.0> exit with reason
> > bad argument in call to eleveldb:async_write(#Ref<0.0.2075.159423>, <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> > []) in eleveldb:write/3 line 155 in context child_terminated
> >
> > ---------- node 4 -----------
> >
> > 2014-07-30 06:16:10.004 UTC [info]
> > <0.28895.382>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> > Total partitions: 1
> > Finished partitions: 1
> > Speed: 100
> > Total 2i items scanned: 0
> > Total tree objects: 0
> > Total objects fixed: 0
> > With errors:
> > Partition: 319703483166135013357056057156686910549735243776
> > Error: index_scan_timeout
> >
> >
> > 2014-07-30 06:16:10.004 UTC [error] <0.1580.0> gen_server <0.1580.0>
> > terminated with reason: bad argument in call to
> > eleveldb:async_write(#Ref<0.0.367.155781>, <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> > []) in eleveldb:write/3 line 155
> > 2014-07-30 06:16:10.004 UTC [error] <0.1580.0> CRASH REPORT Process
> > <0.1580.0> with 0 neighbours exited with reason: bad argument in call to
> > eleveldb:async_write(#Ref<0.0.367.155781>, <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> > []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
> > 2014-07-30 06:16:10.005 UTC [error] <0.1570.0> Supervisor
> > {<0.1570.0>,poolboy_sup} had child riak_core_vnode_worker started with
> > {riak_core_vnode_worker,start_link,undefined} at <0.1580.0> exit with
> > reason bad argument in call to eleveldb:async_write(#Ref<0.0.367.155781>,
> > <<>>,
> > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> > []) in eleveldb:write/3 line 155 in context child_terminated
> >
> > ---------- node 5 -----------
> >
> > 2014-07-30 06:16:09.191 UTC [info]
> > <0.15985.355>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> > Total partitions: 1
> > Finished partitions: 1
> > Speed: 100
> > Total 2i items scanned: 0
> > Total tree objects: 0
> > Total objects fixed: 0
> > With errors:
> > Partition: 833512652540280570538039006158505159647524028416
> > Error: index_scan_timeout
> >
> >
> > 2014-07-30 06:16:09.191 UTC [error] <0.1601.0> gen_server <0.1601.0>
> > terminated with reason: bad argument in call to
> > eleveldb:async_get(#Ref<0.0.351.26505>, <<>>,
> > <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> > []) in eleveldb:get/3 line 143
> > 2014-07-30 06:16:09.191 UTC [error] <0.1601.0> CRASH REPORT Process
> > <0.1601.0> with 0 neighbours exited with reason: bad argument in call to
> > eleveldb:async_get(#Ref<0.0.351.26505>, <<>>,
> > <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> > []) in eleveldb:get/3 line 143 in gen_server:terminate/6 line 747
> > 2014-07-30 06:16:09.192 UTC [error] <0.1598.0> Supervisor
> > {<0.1598.0>,poolboy_sup} had child riak_core_vnode_worker started with
> > {riak_core_vnode_worker,start_link,undefined} at <0.1601.0> exit with
> > reason bad argument in call to eleveldb:async_get(#Ref<0.0.351.26505>,
> > <<>>,
> > <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> > []) in eleveldb:get/3 line 143 in context child_terminated
> >
> > ---------- node 6 -----------
> >
> > 2014-07-30 06:16:09.154 UTC [info]
> > <0.32042.379>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> > Total partitions: 1
> > Finished partitions: 1
> > Speed: 100
> > Total 2i items scanned: 0
> > Total tree objects: 0
> > Total objects fixed: 0
> > With errors:
> > Partition: 34253944624943037145398863266787883273185918976
> > Error: index_scan_timeout
> >
> >
> > 2014-07-30 06:16:09.154 UTC [error] <0.4086.0> gen_server <0.4086.0>
> > terminated with reason: bad argument in call to
> > eleveldb:async_get(#Ref<0.0.2698.198008>, <<>>,
> > <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> > []) in eleveldb:get/3 line 143
> > 2014-07-30 06:16:09.154 UTC [error] <0.4086.0> CRASH REPORT Process
> > <0.4086.0> with 0 neighbours exited with reason: bad argument in call to
> > eleveldb:async_get(#Ref<0.0.2698.198008>, <<>>,
> > <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> > []) in eleveldb:get/3 line 143 in gen_server:terminate/6 line 747
> > 2014-07-30 06:16:09.154 UTC [error] <0.4085.0> Supervisor
> > {<0.4085.0>,poolboy_sup} had child riak_core_vnode_worker started with
> > {riak_core_vnode_worker,start_link,undefined} at <0.4086.0> exit with
> > reason bad argument in call to eleveldb:async_get(#Ref<0.0.2698.198008>,
> > <<>>,
> > <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> > []) in eleveldb:get/3 line 143 in context child_terminated
> >
> > On Wed, Jul 30, 2014 at 09:50:22AM +0100, Russell Brown wrote:
> >> Hi Simon,
> >> So the earlier “this is on wheezy, rest are on squeeze” thing is no longer
> >> a factor?
> >>
> >> Does every 2i repair you run end with the same error?
> >>
> >> Cheers
> >>
> >> Russell
> >>
> >> On 30 Jul 2014, at 07:29, Effenberg, Simon <[email protected]>
> >> wrote:
> >>
> >>> I have now tried it with one partition on each of 6 different machines
> >>> and got the same result everywhere: index_scan_timeout, plus the error
> >>> "bad argument in call to eleveldb:async_get" (2x) or async_write (4x).
> >>>
> >>>
> >>> Sent from Samsung Mobile
> >>>
> >>>
> >>> -------- Original message --------
> >>> From: "Effenberg, Simon"
> >>> Date: 30.07.2014 07:49 (GMT+01:00)
> >>> To: bryan hunt
> >>> Cc: [email protected]
> >>> Subject: Re: repair-2i stops with "bad argument in call to
> >>> eleveldb:async_write"
> >>>
> >>> Hi,
> >>>
> >>> I tried it on two different nodes with one partition each. Both multiple
> >>> times, before and after the upgrade.
> >>>
> >>> I will try it on other machines in a minute, but since I have already
> >>> tried it on two different nodes, and one of them is only 2 weeks old and
> >>> stored on an HP 3PAR, I bet this is not a disk corruption issue.
> >>>
> >>> Simon
> >>>
> >>>
> >>> Sent from Samsung Mobile
> >>>
> >>>
> >>> -------- Original message --------
> >>> From: bryan hunt
> >>> Date: 29.07.2014 18:21 (GMT+01:00)
> >>> To: "Effenberg, Simon"
> >>> Cc: [email protected]
> >>> Subject: Re: repair-2i stops with "bad argument in call to
> >>> eleveldb:async_write"
> >>>
> >>> Hi Simon,
> >>>
> >>> Does the problem persist if you run it again?
> >>>
> >>> Does it happen if you run it against any other partition?
> >>>
> >>> Best Regards,
> >>>
> >>> Bryan
> >>>
> >>>
> >>>
> >>> Bryan Hunt - Client Services Engineer - Basho Technologies Limited -
> >>> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
> >>>
> >>> On 29 Jul 2014, at 09:35, Effenberg, Simon <[email protected]>
> >>> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> we have some issues with 2i queries like this:
> >>>>
> >>>> seffenberg@kriak46-1:~$ while :; do curl -s
> >>>> localhost:8098/buckets/conversation/index/createdat_int/0/23182680 |
> >>>> ruby -rjson -e "o = JSON.parse(STDIN.read); puts o['keys'].size"; sleep 1; done
> >>>>
> >>>> 13853
> >>>> 13853
> >>>> 0
> >>>> 557
> >>>> 557
> >>>> 557
> >>>> 13853
> >>>> 0
> >>>>
> >>>>
> >>>> ...
> >>>>
> >>>> So I tried to start a repair-2i, first on one vnode/partition on one node
> >>>> (which is quite new in the cluster, 2 weeks or so).
> >>>>
> >>>> The command fails with the following log entries:
> >>>>
> >>>> seffenberg@kriak46-7:~$ sudo riak-admin repair-2i 22835963083295358096932575511191922182123945984
> >>>> Will repair 2i on these partitions:
> >>>> 22835963083295358096932575511191922182123945984
> >>>> Watch the logs for 2i repair progress reports
> >>>> seffenberg@kriak46-7:~$ 2014-07-29 08:20:22.729 UTC [info]
> >>>> <0.5929.1061>@riak_kv_2i_aae:init:139 Starting 2i repair at speed 100
> >>>> for partitions [22835963083295358096932575511191922182123945984]
> >>>> 2014-07-29 08:20:22.729 UTC [info]
> >>>> <0.5930.1061>@riak_kv_2i_aae:repair_partition:257 Acquired lock on
> >>>> partition 22835963083295358096932575511191922182123945984
> >>>> 2014-07-29 08:20:22.729 UTC [info]
> >>>> <0.5930.1061>@riak_kv_2i_aae:repair_partition:259 Repairing indexes in
> >>>> partition 22835963083295358096932575511191922182123945984
> >>>> 2014-07-29 08:20:22.740 UTC [info]
> >>>> <0.5930.1061>@riak_kv_2i_aae:create_index_data_db:324 Creating temporary
> >>>> database of 2i data in /var/lib/riak/anti_entropy/2i/tmp_db
> >>>> 2014-07-29 08:20:22.751 UTC [info]
> >>>> <0.5930.1061>@riak_kv_2i_aae:create_index_data_db:361 Grabbing all index
> >>>> data for partition 22835963083295358096932575511191922182123945984
> >>>> 2014-07-29 08:25:22.752 UTC [info]
> >>>> <0.5929.1061>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> >>>> Total partitions: 1
> >>>> Finished partitions: 1
> >>>> Speed: 100
> >>>> Total 2i items scanned: 0
> >>>> Total tree objects: 0
> >>>> Total objects fixed: 0
> >>>> With errors:
> >>>> Partition: 22835963083295358096932575511191922182123945984
> >>>> Error: index_scan_timeout
> >>>>
> >>>>
> >>>> 2014-07-29 08:25:22.752 UTC [error] <0.4711.1061> gen_server
> >>>> <0.4711.1061> terminated with reason: bad argument in call to
> >>>> eleveldb:async_write(#Ref<0.0.10120.211816>, <<>>,
> >>>> [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>>> []) in eleveldb:write/3 line 155
> >>>> 2014-07-29 08:25:22.753 UTC [error] <0.4711.1061> CRASH REPORT Process
> >>>> <0.4711.1061> with 0 neighbours exited with reason: bad argument in call
> >>>> to eleveldb:async_write(#Ref<0.0.10120.211816>, <<>>,
> >>>> [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>>> []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
> >>>> 2014-07-29 08:25:22.753 UTC [error] <0.1031.0> Supervisor
> >>>> {<0.1031.0>,poolboy_sup} had child riak_core_vnode_worker started with
> >>>> {riak_core_vnode_worker,start_link,undefined} at <0.4711.1061> exit with
> >>>> reason bad argument in call to
> >>>> eleveldb:async_write(#Ref<0.0.10120.211816>, <<>>,
> >>>> [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>>> []) in eleveldb:write/3 line 155 in context child_terminated
> >>>>
> >>>>
> >>>> Anything I can do about that? What's the issue here?
> >>>>
> >>>> I'm using Riak 1.4.8 (.deb package).
> >>>>
> >>>> Cheers
> >>>> Simon
> >>>> _______________________________________________
> >>>> riak-users mailing list
> >>>> [email protected]
> >>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
> > --
> > Simon Effenberg | Site Op | mobile.international GmbH
> >
> > Phone: + 49. 30. 8109. 7173
> > M-Phone: + 49. 151. 5266. 1558
> > Mail: [email protected]
> > Web: www.mobile.de
> >
> > Marktplatz 1 | 14532 Europarc Dreilinden | Germany
> >
> > ______________________________________________________
> > Managing Director: Malte Krüger
> > HRB No.: 18517 P, Amtsgericht Potsdam
> > Registered office: Kleinmachnow
> > ______________________________________________________
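
For anyone who wants to reproduce the inconsistent 2i counts from the original report without the curl/ruby one-liner, here is a rough Python sketch of the same check. It only assumes what the thread shows: the query URL from the first mail and a JSON response with a "keys" list. The function names (count_keys, fetch_key_count, summarize, sample) are made up for illustration and are not part of Riak or any client library.

```python
import json
import time
import urllib.request
from collections import Counter

# URL taken from the query in the original report; adjust host/port as needed.
INDEX_URL = "http://localhost:8098/buckets/conversation/index/createdat_int/0/23182680"


def count_keys(body):
    """Return the number of keys in a 2i range-query JSON response body."""
    return len(json.loads(body).get("keys", []))


def fetch_key_count(url=INDEX_URL, timeout=10.0):
    """Run the 2i query once over HTTP and count the returned keys."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return count_keys(resp.read().decode("utf-8"))


def summarize(counts):
    """Tally how often each distinct key count was observed."""
    tally = Counter(counts)
    return {
        "distinct": sorted(tally),       # every different count that was seen
        "consistent": len(tally) <= 1,   # True only if all runs agreed
        "tally": dict(tally),            # count -> number of occurrences
    }


def sample(n=10, delay=1.0, fetch=fetch_key_count):
    """Repeat the query n times, sleeping between requests, and summarize."""
    counts = []
    for i in range(n):
        counts.append(fetch())
        if i < n - 1:
            time.sleep(delay)
    return summarize(counts)
```

Calling sample() against a node gives a quick yes/no on whether repeated runs of the same range query agree. The fetch parameter is there so the counting logic can be exercised without a live cluster, for example by feeding in the eight counts quoted in the first mail.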
