[ 
https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363515#comment-15363515
 ] 

Enis Soztutar commented on HBASE-16074:
---------------------------------------

One trick I used to do with ITBLL is to run with this config: 
{code}
    <property>
      <name>hbase.master.hfilecleaner.ttl</name>
      <value>604800000</value>
      <!-- 7 days-->
    </property>
    <property>
      <name>hbase.master.logcleaner.ttl</name>
      <value>604800000</value>
      <!-- 7 days-->
    </property>
    <property>
      <name>hbase.region.archive.recovered.edits</name>
      <value>true</value>
    </property>
{code} 

which keeps ALL HFiles, WALs and recovered edits in the archive for 7 days. 
I then usually pick one of the missing row keys and search for it in all the 
HFiles, WAL files and recovered edits. The Search tool in ITBLL does this, 
but it has been some time since I last used it. 
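
For anyone who wants to script the manual search described above, here is a 
minimal sketch. It is not the ITBLL Search tool. It only builds command lines 
for the stock HFile and WAL pretty-printers. It assumes a release where the 
"hbase hfile" and "hbase wal" subcommands are available and where their -w 
option filters by row (check the usage output on your version). The paths and 
row key in the example are hypothetical placeholders. ITBLL row keys are 
binary, so the key probably needs to be passed in HBase's escaped binary 
string form.

```python
import shlex


def hfile_search_cmd(hfile_path, row_key):
    """Build a command that prints only the cells of row_key from one HFile."""
    # Assumed flags: -p print key/values, -w seek to this row only, -f file.
    return ["hbase", "hfile", "-p", "-w", row_key, "-f", hfile_path]


def wal_search_cmd(wal_path, row_key):
    """Build a command that prints only the edits for row_key from one WAL.

    Archived recovered.edits files are in WAL format, so they can be
    passed here as well.
    """
    # Assumed flags: -p print values, -w filter by row.
    return ["hbase", "wal", "-p", "-w", row_key, wal_path]


def build_search_commands(hfile_paths, wal_paths, row_key):
    """Return one command (as an argument list) per file to inspect."""
    cmds = [hfile_search_cmd(p, row_key) for p in hfile_paths]
    cmds += [wal_search_cmd(p, row_key) for p in wal_paths]
    return cmds


def to_shell(cmd):
    """Render an argument list as a safely quoted shell command line."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # Hypothetical paths and row key, for illustration only.
    hfiles = ["/hbase/archive/data/default/t1/region1/big/hfile1"]
    wals = ["/hbase/oldWALs/wal1"]
    for cmd in build_search_commands(hfiles, wals, "missing-row-key"):
        print(to_shell(cmd))
```

The script only prints the commands. The file lists could come from a 
recursive listing of the archive and oldWALs directories, and each command's 
output can then be grepped for the missing key.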

You should also enable DEBUG level logging everywhere. 

> ITBLL fails, reports lost big or tiny families
> ----------------------------------------------
>
>                 Key: HBASE-16074
>                 URL: https://issues.apache.org/jira/browse/HBASE-16074
>             Project: HBase
>          Issue Type: Bug
>          Components: integration tests
>    Affects Versions: 1.3.0, 0.98.20
>            Reporter: Mikhail Antonov
>            Assignee: Mikhail Antonov
>            Priority: Blocker
>             Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.21
>
>         Attachments: 16074.test.branch-1.3.patch, 16074.test.patch, 
> HBASE-16074.branch-1.3.001.patch, HBASE-16074.branch-1.3.002.patch, 
> HBASE-16074.branch-1.3.003.patch, HBASE-16074.branch-1.3.003.patch, 
> changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, 
> itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size 
> distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or 
> tiny families, count=164
> I do not yet know whether it's a bug, a test issue or an env setup 
> issue, but I need to figure it out. Opening this to raise awareness and see 
> if someone has seen this recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
