The tablet server failed when it tried to do that read. One thing of interest is that the numbers match up. In the test program and the tablet server, the block offset and size is the same : "offset=3456401 csize=107952 rsize=131093". This rules out that the tablet server is simply reading data from the wrong location in the file. However the tablet server failed when it tried to read the block at location 3456401. This seems to show that sometimes the data can be read but other times it can not.
I modified the test program to lookup all rows that exist in the file. I pushed the changes to github. This is more similar to what is happening on the tablet server. You no longer need to pass a row # to the test program, just a file name. Just run LargeRowDirectQuery <file>. Can you try running this and see if an exception occurs? Keith On Mon, Apr 16, 2012 at 9:27 PM, Keys Botzum <[email protected]> wrote: > Keith, > > As requested I have run the new test you provided. > > Here is the output from ./run.py -t large row -v 10 -d > console: > > > > > logs: > > > > > Here is the output from the new test program: > /opt/keith-turner-accumulo-1.4.0-MapR-630654d/bin$ ./accumulo > org.apache.accumulo.server.test.functional.LargeRowDirectQuery 61 > /mapr/my.cluster.com/user/mapr/accumulo-SE-test-04-28878/tables/2/ > > 16 18:10:08,112 [util.NativeCodeLoader] INFO : Loaded the native-hadoop > library > 16 18:10:08,320 [rfile.RFile] DEBUG: Getting block offset=2916275 > csize=108005 rsize=131093 entries=1 key=d_FH?P(Wl`V.D8A4Vyx=[L!u5V6"660/=O > vON3r`>I]-J-y\f8dLv[?zmE30/"M... TRUNCATED : [] 1334170440161 false > 16 18:10:08,324 [rfile.RelativeKey] DEBUG: entering fastSkip() > 16 18:10:08,332 [rfile.RelativeKey] DEBUG: fieldsSame = 0 > 16 18:10:08,332 [rfile.RelativeKey] DEBUG: len = 131072 > 16 18:10:08,339 [rfile.RelativeKey] DEBUG: data = > d_FH?P(Wl`V.D8A4Vyx=[L!u5V6"660/ > 16 18:10:08,339 [rfile.RelativeKey] DEBUG: len = 0 > 16 18:10:08,340 [rfile.RelativeKey] DEBUG: data = > 16 18:10:08,340 [rfile.RelativeKey] DEBUG: len = 0 > 16 18:10:08,340 [rfile.RelativeKey] DEBUG: data = > 16 18:10:08,340 [rfile.RelativeKey] DEBUG: len = 0 > 16 18:10:08,340 [rfile.RelativeKey] DEBUG: data = > 16 18:10:08,340 [rfile.RelativeKey] DEBUG: Read ts 1334170440161 > 16 18:10:08,340 [rfile.RelativeKey] DEBUG: len = 2 > 16 18:10:08,340 [rfile.RelativeKey] DEBUG: data = 61 > top key : d_FH?P(Wl`V.D8A4Vyx=[L!u5V6"660/=O vON3r`>I]-J-y\f8dLv[?zmE30/"M... > TRUNCATED : [] 1334170440161 false > 16 18:10:08,369 [rfile.RFile] DEBUG: Getting block offset=3456401 > csize=107952 rsize=131093 entries=1 > key=eZ'n/^E<5GmL<8k2\zVe;4P"3nC<%2Ic69c?.B@KmUDhLn?OmZ&$xgwTLc{x4if:... > TRUNCATED : [] 1334170440132 false > 16 18:10:08,376 [rfile.RelativeKey] DEBUG: fieldsSame = 0 > 16 18:10:08,376 [rfile.RelativeKey] DEBUG: len = 131072 > 16 18:10:08,378 [rfile.RelativeKey] DEBUG: data = > eZ'n/^E<5GmL<8k2\zVe;4P"3nC<%2Ic > 16 18:10:08,378 [rfile.RelativeKey] DEBUG: len = 0 > 16 18:10:08,378 [rfile.RelativeKey] DEBUG: data = > 16 18:10:08,378 [rfile.RelativeKey] DEBUG: len = 0 > 16 18:10:08,378 [rfile.RelativeKey] DEBUG: data = > 16 18:10:08,378 [rfile.RelativeKey] DEBUG: len = 0 > 16 18:10:08,378 [rfile.RelativeKey] DEBUG: data = > 16 18:10:08,378 [rfile.RelativeKey] DEBUG: Read ts 1334170440132 > > Is this what you expected? It seems rather brief. Should I run it > differently? Out of curiosity I ran it using 60, 62, and 70 as input just to > see what happens: > > 60: > > 16 18:13:34,076 [util.NativeCodeLoader] INFO : Loaded the native-hadoop > library > > 62: > > 16 18:14:07,656 [util.NativeCodeLoader] INFO : Loaded the native-hadoop > library > > 70: > > 16 18:15:49,995 [util.NativeCodeLoader] INFO : Loaded the native-hadoop > library > > > Not sure what this means. > > > > > Note that I ran the auto test twice. The reason I did so is because the first > run produced an error I've never seen before. Here's the error: > …. Verify Call Count 2 > …. > Creating Range at row 96 initial bytes are: )L#W+688U29C81-#4okb#-liSsd[!MB7 > key = )L#W+688U29C81-#4okb#-liSsd[!MB7VO;"*nv/1LN546++1Vu(`deul$`h08,Z... > TRUNCATED : [] 1334170440096 false > Creating Range at row 97 initial bytes are: N`+y-M`C6(hjBsSoLB4xEhrU{x9+0UXL > key = N`+y-M`C6(hjBsSoLB4xEhrU{x9+0UXLgd%J!jUC"1"b#gMAJH-+.R(Z\JE=j+fK... > TRUNCATED : [] 1334170440097 false > Creating Range at row 98 initial bytes are: ^+u6DeEB*3quaF)*yr*4h3p?\"?GR^_M > key = ^+u6DeEB*3quaF)*yr*4h3p?\"?GR^_Mc0)D+g!i&_6YYOe*NMGaES=w9' /Ifd5... > TRUNCATED : [] 1334170440098 false > Creating Range at row 99 initial bytes are: '?[3F /4g=?/#-U3IFa;.n <('vgazD9 > key = '?[3F /4g=?/#-U3IFa;.n <('vgazD9`DY96xKZsD0:H$)CS>Q%[TG()MKR'Y<M... > TRUNCATED : [] 1334170440099 false > 16 17:58:24,809 [admin.TableOperations] INFO : Problem with metadata table, > it has a hole / != * ... retrying ... > ERROR:test.auto:This looks like a stack trace: Thread > "org.apache.accumulo.server.test.functional.FunctionalTest" died null > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.accumulo.start.Main$1.run(Main.java:89) > at java.lang.Thread.run(Thread.java:662) > Caused by: java.lang.Exception: # of table splits points out of range, > #splits=10 table=lr min=1 max=9 > at > org.apache.accumulo.server.test.functional.FunctionalTest.checkSplits(FunctionalTest.java:216) > at > org.apache.accumulo.server.test.functional.LargeRowTest.test1(LargeRowTest.java:98) > at > org.apache.accumulo.server.test.functional.LargeRowTest.run(LargeRowTest.java:86) > at > org.apache.accumulo.server.test.functional.FunctionalTest.main(FunctionalTest.java:312) > ... 6 more > > Here's the same sequence from the run that yields the "expected" error: > > …. Verify Call Count 2 > …. > DEBUG:test.auto:out: Creating Range at row 98 initial bytes are: > ^+u6DeEB*3quaF)*yr*4h3p?\"?GR^_M > DEBUG:test.auto:out: key = > ^+u6DeEB*3quaF)*yr*4h3p?\"?GR^_Mc0)D+g!i&_6YYOe*NMGaES=w9' /Ifd5... TRUNCATED > : [] 1334170440098 false > DEBUG:test.auto:out: Creating Range at row 99 initial bytes are: '?[3F > /4g=?/#-U3IFa;.n <('vgazD9 > DEBUG:test.auto:out: key = '?[3F /4g=?/#-U3IFa;.n > <('vgazD9`DY96xKZsD0:H$)CS>Q%[TG()MKR'Y<M... TRUNCATED : [] 1334170440099 > false > DEBUG:test.auto:out: Verify Call Count 3 > > > There is no corresponding error in the server logs that I can see but I've > save them in case you are curious. I have the logs in case you are curious. > This might be a spurious error but I include it just in case it gives you an > idea. > > In both runs there are numerous "incorrect header check" messages in the > logs. That might be significant. > > If you need anything, do not hesitate to ask. > > Thanks again, > Keys > ________________________________ > Keys Botzum > Senior Principal Technologist > WW Systems Engineering > [email protected] > 443-718-0098 > MapR Technologies > http://www.mapr.com > > > > On Apr 16, 2012, at 6:32 PM, Keith Turner wrote: >
