Thanks, Doug! I will try this version.

Can you explain where the problem lies? I'm really curious about it.
I looked at commit 2fe2055265c930b001d68d8893be86d14db3d459
but couldn't understand it fully.

Donald

On Fri, Aug 29, 2008 at 7:52 AM, Doug Judd <[EMAIL PROTECTED]> wrote:
> Hi Donald,
>
> I just checked in a fix for what I believe was the source of this problem.
> Can you pull the latest code (see
> http://code.google.com/p/hypertable/wiki/SourceCode?tm=4) and see if this,
> indeed, fixes the problem?  Thanks.
>
> - Doug
>
> On Wed, Aug 27, 2008 at 7:07 PM, Liu Kejia(Donald) <[EMAIL PROTECTED]>
> wrote:
>>
>> I got these ERRORs yesterday in a Range Server's log:
>>
>> 1219818423 ERROR unknown :
>> (/home/ht4/src/hypertable-0.9.0.10-0821/src/cc/Hypertable/Lib/RangeLocator.cc:502)
>> Incomplete METADATA record found in root tablet under row key ''
>> 1219818424 ERROR unknown :
>> (/home/ht4/src/hypertable-0.9.0.10-0821/src/cc/Hypertable/Lib/RangeLocator.cc:502)
>> Incomplete METADATA record found in root tablet under row key ''
>> 1219818426 ERROR unknown :
>> (/home/ht4/src/hypertable-0.9.0.10-0821/src/cc/Hypertable/Lib/RangeLocator.cc:502)
>> Incomplete METADATA record found in root tablet under row key ''
>>
>> Donald
>>
>> 2008/8/28 Liu Kejia(Donald) <[EMAIL PROTECTED]>:
>> > Hi Doug,
>> >
>> > It should be the first insert, though I'm not quite sure.
>> >
>> > I'm trying to reproduce this problem too. I'll let you know if I have
>> > any new findings.
>> >
>> > Donald
>> >
>> > 2008/8/27 Doug Judd <[EMAIL PROTECTED]>:
>> >> Hi Donald,
>> >>
>> >> I'm still trying to reproduce this problem.  So, do you get these errors
>> >> immediately?  In other words, you do the following sequence:
>> >>
>> >> 1. load the tables
>> >> 2. drop them
>> >> 3. re-create the tables
>> >> 4. load the new tables
>> >>
>> >> Does the problem occur upon the first insert in step #4?  Or does the load
>> >> progress successfully for some time before you see the errors?
>> >>
>> >> - Doug
>> >>
>> >> 2008/8/25 donald <[EMAIL PROTECTED]>
>> >>>
>> >>> Maybe this problem isn't closed yet. In our recent tests, we saw the
>> >>> following ERRORs in the client-side log a few times:
>> >>>
>> >>> 1219669889 ERROR unknown :
>> >>> (/home/ht3/src/hypertable/src/cc/Hypertable/Lib/RangeLocator.cc:204)
>> >>> RangeLocator failed to find metadata for table 'session_raw' row
>> >>> '1E43A1A640D5F22A28964292289C64CD00000000764A909F0000000048A812C0'
>> >>> 1219669889 ERROR unknown :
>> >>> (/home/ht3/src/hypertable/src/cc/Hypertable/Lib/RangeLocator.cc:204)
>> >>> RangeLocator failed to find metadata for table 'session_raw' row
>> >>> '1E43A1A640D5F22A28964292289C64CD00000000764A909F0000000048A812C0'
>> >>> 1219669889 ERROR unknown :
>> >>> (/home/ht3/src/hypertable/src/cc/Hypertable/Lib/RangeLocator.cc:204)
>> >>> RangeLocator failed to find metadata for table 'session_raw' row
>> >>> '1E43A1A640D5F22A28964292289C64CD00000000764A909F0000000048A812C0'
>> >>> 1219669889 ERROR unknown :
>> >>> (/home/ht3/src/hypertable/src/cc/Hypertable/Lib/RangeLocator.cc:204)
>> >>> RangeLocator failed to find metadata for table 'session_raw' row
>> >>> '1E43A1A640D5F22A28964292289C64CD00000000764A909F0000000048A812C0'
>> >>> while inserting,
>> >>> Error: HYPERTABLE request timeout
>> >>>
>> >>> This usually happens when we first drop all tables in hypertable, then
>> >>> create tables with the same names and try to insert data into them.
>> >>> I've manually examined the METADATA table and it turns out NO range of
>> >>> this session_raw table has its Location column filled. So I wonder if
>> >>> this is related to the former problem, though I don't see any sign of
>> >>> it in the range server logs.
>> >>>
>> >>> Donald
>> >>>
>> >>>
>> >>> On Aug 15, 12:36 pm, "Doug Judd" <[EMAIL PROTECTED]> wrote:
>> >>> > That's great, thanks Donald!  BTW, I'm glad you reported that error.
>> >>> > There was a problem under that error scenario which resulted in a
>> >>> > lost range.  I updated the code so that it will re-try 3 times and
>> >>> > then abort if it can't write the Range transaction log.  When the
>> >>> > RangeServer recovers from the abort, it will pick up where it left
>> >>> > off in the split process and the newly split off range will appear
>> >>> > as it should.  Your reports are extremely valuable in stabilizing
>> >>> > the code.
>> >>> >
>> >>> > - Doug
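
The retry-then-abort behaviour described in the message above can be sketched roughly as follows. This is a Python illustration only, not the Hypertable code (which is C++). The names `append_with_retry` and `LogWriteError` are hypothetical, and reading "re-try 3 times" as three attempts in total is an assumption.

```python
import time


class LogWriteError(Exception):
    """Hypothetical error standing in for a failed transaction-log append."""


def append_with_retry(append, payload, max_attempts=3, delay_seconds=0.0):
    """Call append(payload) up to max_attempts times, then abort the process.

    Aborting (here: SystemExit) instead of carrying on means the caller never
    continues with an unlogged state change; recovery can then resume from
    whatever was durably logged before the failure.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return append(payload)
        except LogWriteError as err:
            last_error = err
            if attempt < max_attempts:
                time.sleep(delay_seconds)  # optional pause before the next try
    raise SystemExit(f"giving up after {max_attempts} attempts: {last_error}")
```

A caller would pass in whatever function performs the actual append; a transient failure that clears within three attempts is invisible to the caller, while a persistent one stops the process.
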
>> >>> >
>> >>> > 2008/8/14 donald <[EMAIL PROTECTED]>
>> >>> >
>> >>> >
>> >>> >
>> >>> > > Hi Doug,
>> >>> >
>> >>> > > It seems our hadoop dfs became corrupted during the upgrade from
>> >>> > > 0.15 to 0.17. I've deleted all blocks manually and reformatted the
>> >>> > > namenode. After that I retried the LogLoad test twice, and both
>> >>> > > runs finished successfully.
>> >>> >
>> >>> > > Thanks very much for your support!
>> >>> >
>> >>> > > Donald
>> >>> >
>> >>> > > On Aug 14, 7:04 pm, "刘可嘉" <[EMAIL PROTECTED]> wrote:
>> >>> > > > This might be the problem:
>> >>> > > > https://issues.apache.org/jira/browse/HADOOP-2669
>> >>> >
>> >>> > > > I've just modified hadoop code to add log information upon lease
>> >>> > > > renewal, hopefully I can figure it out later today...
>> >>> >
>> >>> > > > Donald
>> >>> >
>> >>> > > > On Thu, Aug 14, 2008 at 10:42 AM, donald <[EMAIL PROTECTED]>
>> >>> > > > wrote:
>> >>> >
>> >>> > > > > Hi Doug,
>> >>> >
>> >>> > > > > I've tried it twice, and there are no longer request timeouts.
>> >>> > > > > However, I've run into another problem: HdfsBroker throws
>> >>> > > > > exceptions while writing to the range txn log.
>> >>> >
>> >>> > > > > In my LogLoad program's log:
>> >>> >
>> >>> > > > > 1218652507 ERROR unknown :
>> >>> > > > > (/home/ht1/src/hypertable-0.9.0.9-alpha/src/cc/Hypertable/Lib/RangeLocator.cc:204)
>> >>> > > > > Incomplete METADATA record found in root range under row key 'fbdbbd2a15e39dc7'
>> >>> > > > > 1218652507 ERROR unknown :
>> >>> > > > > (/home/ht1/src/hypertable-0.9.0.9-alpha/src/cc/Hypertable/Lib/RangeLocator.cc:204)
>> >>> > > > > RangeLocator failed to find metadata for table 'session_raw' row 'fbd1ecdf4155a4ca'
>> >>> > > > > while inserting,
>> >>> > > > > Error: HYPERTABLE request timeout, Locating range for row = 'fbd1ecdf4155a4ca'
>> >>> > > > > Time 1218652507 = Thu Aug 14 02:35:07 2008.
>> >>> >
>> >>> > > > > And it turns out the "Location" column of row 'fbdbbd2a15e39dc7'
>> >>> > > > > is missing in the METADATA table:
>> >>> > > > > hypertable> select * from METADATA display_timestamps;
>> >>> > > > > [...]
>> >>> > > > > 2008-08-13 18:33:29.089532001   1:fbc575a1f0273073   StartRow        fb97a550e129b18b
>> >>> > > > > 2008-08-13 18:33:29.212569001   1:fbc575a1f0273073   Location        10.65.25.156_38060
>> >>> > > > > 2008-08-13 18:33:51.318250002   1:fbdbbd2a15e39dc7   Files:default   /hypertable/tables/session_raw/default/856E25486E631CECC9AC58C5/cs604;
>> >>> > > > > 2008-08-13 18:33:51.318250001   1:fbdbbd2a15e39dc7   StartRow        fbc575a1f0273073
>> >>> > > > > 2008-08-13 18:33:51.291840001   1:fbf13c13dd675047   Files:default   /hypertable/tables/session_raw/default/856E25486E631CECC9AC58C5/cs604;
>> >>> > > > > 2008-08-13 18:33:51.318250003   1:fbf13c13dd675047   StartRow        fbdbbd2a15e39dc7
>> >>> > > > > 2008-08-13 18:33:29.089532003   1:fbf13c13dd675047   StartRow        fbc575a1f0273073
>> >>> > > > > 2008-08-13 17:12:36.872430001   1:fbf13c13dd675047   StartRow        fb97a550e129b18b
>> >>> > > > > 2008-08-13 17:12:37.048292001   1:fbf13c13dd675047   Location        10.65.25.156_38060
>> >>> > > > > 2008-08-13 18:33:30.496562001   1:fc1f39c4b322dda0   Files:default   /hypertable/tables/session_raw/default/234659F6D672D8037D0DF666/cs486;
>> >>> > > > > [...]
>> >>> > > > > These times are in UTC, so 18:33:51 is 02:33:51 in my
>> >>> > > > > timezone.
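
The check done by eye on the METADATA dump above (which rows never get a Location cell) can also be done mechanically. What follows is a minimal Python sketch, not a Hypertable tool: the `cells` list is typed in by hand from the dump, the helper name `rows_missing` is made up for illustration, and row 1:fc1f39c4b322dda0 is left out because the dump is cut off by "[...]" right after its first cell.

```python
from collections import defaultdict

# (row key, column) pairs copied by hand from the METADATA dump above.
cells = [
    ("1:fbc575a1f0273073", "StartRow"),
    ("1:fbc575a1f0273073", "Location"),
    ("1:fbdbbd2a15e39dc7", "Files:default"),
    ("1:fbdbbd2a15e39dc7", "StartRow"),
    ("1:fbf13c13dd675047", "Files:default"),
    ("1:fbf13c13dd675047", "StartRow"),
    ("1:fbf13c13dd675047", "Location"),
]


def rows_missing(cells, column="Location"):
    """Return the sorted row keys that have no cell in the given column."""
    columns_by_row = defaultdict(set)
    for row, col in cells:
        columns_by_row[row].add(col)
    return sorted(row for row, cols in columns_by_row.items() if column not in cols)


print(rows_missing(cells))  # ['1:fbdbbd2a15e39dc7']
```

The only row reported is the one named in the "Incomplete METADATA record" error, which matches the manual reading of the dump.
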
>> >>> >
>> >>> > > > > So I took a look at the Hypertable.RangeServer.log on its
>> >>> > > > > neighbour range's location, 10.65.25.156, and found the
>> >>> > > > > following ERROR:
>> >>> >
>> >>> > > > > 1218652431 ERROR Hypertable.RangeServer :
>> >>> > > > > (/home/ht1/src/hypertable/src/cc/Hypertable/RangeServer/MaintenanceQueue.h:111)
>> >>> > > > > DFS BROKER i/o error (Error appending 188 bytes to DFS fd 2)
>> >>> > > > > Time 1218652431 = Thu Aug 14 02:33:51 2008.
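
The epoch values in the two "Time ... =" lines above can be lined up with the UTC timestamps in the METADATA dump using a few lines of Python. This is only a convenience sketch; the fixed UTC+8 offset is an assumption inferred from the eight-hour difference mentioned in the message, and `describe` is a made-up helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the reporter's local zone is a fixed UTC+8 offset.
UTC_PLUS_8 = timezone(timedelta(hours=8))


def describe(epoch_seconds):
    """Return (UTC, UTC+8) renderings of a Unix timestamp."""
    utc = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    local = utc.astimezone(UTC_PLUS_8)
    fmt = "%Y-%m-%d %H:%M:%S"
    return utc.strftime(fmt), local.strftime(fmt)


for ts in (1218652507, 1218652431):
    print(ts, *describe(ts))
# 1218652507 2008-08-13 18:35:07 2008-08-14 02:35:07
# 1218652431 2008-08-13 18:33:51 2008-08-14 02:33:51
```

The second value, 18:33:51 UTC, is the same second as the cells written for row 1:fbdbbd2a15e39dc7 in the dump.
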
>> >>> >
>> >>> > > > > And the HdfsBroker.log:
>> >>> > > > > [...]
>> >>> > > > > Aug 14, 2008 12:00:27 AM org.hypertable.DfsBroker.hadoop.HdfsBroker Create
>> >>> > > > > INFO: Creating file '/hypertable/servers/10.65.25.156_38060/log/range_txn/0.log' handle = 2
>> >>> > > > > [...]
>> >>> > > > > 08/08/14 02:33:29 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException:
>> >>> > > > > org.apache.hadoop.dfs.LeaseExpiredException: No lease on
>> >>> > > > > /hypertable/servers/10.65.25.156_38060/log/range_txn/0.log File is not open for writing.
>> >>> > > > > [Lease.  Holder: 44 46 53 43 6c 69 65 6e 74 5f 37 39 36 31 37 30 31 36 39, heldlocks: 0, pendingcreates: 6]
>> >>> > > > >        at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1198)
>> >>> > > > >        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1125)
>> >>> > > > >        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:300)
>> >>> > > > >        at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
>> >>> > > > >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>> > > > >        at java.lang.reflect.Method.invoke(Method.java:585)
>> >>> > > > >        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
>> >>> > > > >        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
>> >>> > > > >
>> >>> > > > >        at org.apache.hadoop.ipc.Client.call(Client.java:557)
>> >>> > > > >        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
>> >>> > > > >        at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>> >>> > > > >        at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>> >>> > > > >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>> > > > >        at java.lang.reflect.Method.invoke(Method.java:585)
>> >>> > > > >        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>> >>> > > > >        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>> >>> > > > >        at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>> >>> > > > >        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2334)
>> >>> > > > >        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2219)
>> >>> > > > >        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1700(DFSClient.java:1702)
>> >>> > > > >        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1842)
>> >>> > > > >
>> >>> > > > > 08/08/14 02:33:29 WARN dfs.DFSClient: NotReplicatedYetException sleeping
>> >>> > > > > /hypertable/servers/10.65.25.156_38060/log/range_txn/0.log retries left 4
>> >>> > > > > [...]
>> >>> > > > > 08/08/14 02:33:30 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException:
>> >>> > > > > org.apache.hadoop.dfs.LeaseExpiredException: No lease on
>> >>> > > > > /hypertable/servers/10.65.25.156_38060/log/range_txn/0.log File is not open for writing.
>> >>> > > > > [Lease.  Holder: 44 46 53 43 6c 69 65 6e 74 5f 37 39 36 31 37 30 31 36 39, heldlocks: 0, pendingcreates: 8]
>> >>> > > > >        at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1198)
>> >>> > > > >        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1125)
>> >>> > > > >        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:300)
>> >>> > > > >        at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
>> >>> > > > >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>> > > > >        at java.lang.reflect.Method.invoke(Method.java:585)
>> >>> > > > >        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
>> >>> > > > >        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
>> >>> > > > >
>> >>> > > > >        at org.apache.hadoop.ipc.Client.call(Client.java:557)
>> >>> > > > >        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
>> >>> > > > >        at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>> >>> > > > >        at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>> >>> > > > >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>> > > > >        at java.lang.reflect.Method.invoke(Method.java:585)
>> >>> > > > >        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>> >>> > > > >        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>> >>> > > > >        at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>> >>> > > > >        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2334)
>> >>> > > > >        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2219)
>> >>> > > > >        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1700(DFSClient.java:1702)
>> >>> > > > >        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1842)
>> >>> > > > >
>> >>> > > > > 08/08/14 02:33:30 WARN dfs.DFSClient: NotReplicatedYetException sleeping
>> >>> > > > > /hypertable/servers/10.65.25.156_38060/log/range_txn/0.log retries left 3
>> >>> > > > > [...]
>> >>> > > > > 08/08/14 02:33:31 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException:
>> >>> > > > > org.apache.hadoop.dfs.LeaseExpiredException: No lease on
>> >>> > > > > /hypertable/servers/10.65.25.156_38060/log/range_txn/0.log File is not open for writing.
>> >>> > > > > [Lease.  Holder: 44 46 53 43 6c 69 65 6e 74 5f 37 39 36 31 37 30 31 36 39, heldlocks: 0, pendingcreates: 6]
>> >>> > > > >        at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1198)
>> >>> > > > >        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1125)
>> >>> > > > >        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:300)
>> >>> > > > >        at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
>> >>> > > > >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>> > > > >        at java.lang.reflect.Method.invoke(Method.java:585)
>> >>> > > > >        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
>> >>> >
>> >>> > ...
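
The "Holder:" field in the LeaseExpiredException messages above is printed as space-separated hex bytes. Decoding it as ASCII gives the name of the DFS client that held the lease. This is a small Python sketch for convenience only; `decode_holder` is a made-up helper name.

```python
def decode_holder(hex_bytes):
    """Decode a space-separated hex byte string, as printed in the log, to ASCII."""
    return bytes.fromhex(hex_bytes).decode("ascii")


holder = decode_holder("44 46 53 43 6c 69 65 6e 74 5f 37 39 36 31 37 30 31 36 39")
print(holder)  # DFSClient_796170169
```

All three exceptions quoted above carry the same holder bytes, so they all refer to the same client.
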

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Hypertable Development" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/hypertable-dev?hl=en
-~----------~----~----~----~------~----~------~--~---
