For hfileoutputformat, on timeout/failure/kill clean up half-written hfile
--------------------------------------------------------------------------

                 Key: HBASE-2063
                 URL: https://issues.apache.org/jira/browse/HBASE-2063
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack


Below is from mailing list.  Read from bottom to top:

{code}
 I was going to write that perhaps you needed to turn 
mapred.reduce.tasks.speculative.execution off, but if enabling it and things 
work, that would seem to indicate that a our reducer first takes longer than 
the task timeout maximum and secondly, on failure, we should clean up the hfile.

On the first issue, you are using KeyValueSortReducer?  Are your values large?  
We set reducer status every 100 values.  Maybe this is not enough?  We should 
set status more frequently?  If you call context setstatus more frequently, do 
things work w/o speculative execution?
 
On the second, HFileOutputFormat close will set the metadata on the hfile and 
then close it.  On kill, this code is not being called.   Let me see if can do 
something about that (e.g. register a shutdown hook to clean away incomplete 
files -- ).

Thanks,
St.Ack


On Sun, Dec 20, 2009 at 11:26 PM, ChingShen <chingshenc...@gmail.com> wrote:
I think I found a way.
I set the "mapred.reduce.tasks.speculative.execution" to true and output
hfiles again, then successfully load hfiles into hbase.
Is it best solution? or HFileOutputFormat bug?

Shen

On Mon, Dec 21, 2009 at 8:25 AM, ChingShen <chingshenc...@gmail.com> wrote:

> Thanks, stack.
>
> I checked this file that isn't empty. But I found that as long as the
> "Killed Task Attempts" > 0 in reduce phase, and run the loadtable.rb script
> to load hfiles then failed.
> How to avoid this problem?
>
> Thanks.
>
> Shen
>
>
> On Sat, Dec 19, 2009 at 3:49 AM, stack <st...@duboce.net> wrote:
>
>> Check the
>> file
>> hdfs://domU-12-31-39-09-C5-54.compute-1.internal/osm2_hfile/Level4/197894389945760574.
>>  Is it empty?  Was there an error during running of your MR job?  Perhaps
>> a
>> task failed?
>>
>> St.Ack
>>
>>
>>
>> On Thu, Dec 17, 2009 at 9:46 PM, ChingShen <chingshenc...@gmail.com>
>> wrote:
>>
>> > Hi,
>> >  I use the script loadtable.rb to load my hfiles into hbase, but I got
>> an
>> > exception as below.
>> >  Does anyone have any suggestions?
>> >
>> > 09/12/17 23:59:33 INFO loadtable: 18 read firstkey of -3.9290_52.5534
>> from
>> >
>> >
>> hdfs://domU-12-31-39-09-C5-54.compute-1.internal/osm2_hfile/Level4/1978943899457605747
>> > org/apache/hadoop/hbase/io/hfile/HFile.java:1335:in `deserialize':
>> > java.io.IOException: Trailer 'header' is wrong; does the trailer size
>> match
>> > content? (NativeException)
>> >    from org/apache/hadoop/hbase/io/hfile/HFile.java:813:in `readTrailer'
>> >    from org/apache/hadoop/hbase/io/hfile/HFile.java:758:in
>> `loadFileInfo'
>> >    from sun.reflect.GeneratedMethodAccessor7:-1:in `invoke'
>> >    from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
>> >    from java/lang/reflect/Method.java:597:in `invoke'
>> >    from org/jruby/javasupport/JavaMethod.java:298:in
>> > `invokeWithExceptionHandling'
>> >    from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
>> >    from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
>> >     ... 18 levels...
>> >    from org/jruby/Main.java:94:in `main'
>> >    from loadtable.rb:83:in `each'
>> >    from loadtable.rb:83
>> > Complete Java stackTrace
>> > java.io.IOException: Trailer 'header' is wrong; does the trailer size
>> match
>> > content?
>> >    at
>> >
>> >
>> org.apache.hadoop.hbase.io.hfile.HFile$FixedFileTrailer.deserialize(HFile.java:1335)
>> >    at
>> >
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readTrailer(HFile.java:813)
>> >    at
>> >
>> org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:758)
>> >    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>> >    at
>> >
>> >
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >    at java.lang.reflect.Method.invoke(Method.java:597)
>> >    at
>> >
>> >
>> org.jruby.javasupport.JavaMethod.invokeWithExceptionHandling(JavaMethod.java:298)
>> >    at org.jruby.javasupport.JavaMethod.invoke(JavaMethod.java:259)
>> >    at
>> >
>> >
>> org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:36)
>> >    at
>> > org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:70)
>> >    at loadtable.ensure_1$RUBY$__ensure___2(loadtable.rb:86)
>> >    at loadtable.block_0$RUBY$__for__(loadtable.rb:85)
>> >    at loadtableBlockCallback$block_0$RUBY$__for__xx1.call(Unknown
>> Source)
>> >    at org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:102)
>> >    at org.jruby.runtime.Block.yield(Block.java:100)
>> >    at
>> org.jruby.java.proxies.ArrayJavaProxy.each(ArrayJavaProxy.java:112)
>> >    at
>> >
>> >
>> org.jruby.java.proxies.ArrayJavaProxy$i_method_0_0$RUBYINVOKER$each.call(org/jruby/java/proxies/ArrayJavaProxy$i_method_0_0$RUBYINVOKER$each.gen)
>> >    at
>> >
>> >
>> org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:263)
>> >    at
>> >
>> >
>> org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:81)
>> >    at
>> >
>> >
>> org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:96)
>> >    at loadtable.__file__(loadtable.rb:83)
>> >    at loadtable.load(loadtable.rb)
>> >    at org.jruby.Ruby.runScript(Ruby.java:577)
>> >    at org.jruby.Ruby.runNormally(Ruby.java:480)
>> >    at org.jruby.Ruby.runFromMain(Ruby.java:354)
>> >    at org.jruby.Main.run(Main.java:229)
>> >    at org.jruby.Main.run(Main.java:110)
>> >    at org.jruby.Main.main(Main.java:94)
>> >
>>
>
>
>
>


--
*****************************************************
Ching-Shen Chen
Advanced Technology Center,
Information & Communications Research Lab.
E-mail: chenchings...@itri.org.tw
Tel:+886-3-5915542
*****************************************************

{code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to