On Fri, Jan 28, 2011 at 8:05 AM, Laurent Laborde <kerdez...@gmail.com> wrote:
> On Fri, Jan 28, 2011 at 1:12 AM, Namit Jain <nj...@fb.com> wrote:
>> Hi Laurent,
>>
>> 1. Are you saying that _top.sql did not exist in the home directory.
>> Or that, _top.sql existed, but hive was not able to read it after loading
>
> It exist, it's loaded, and i can see it in the hive's warehouse directory.
> it's just impossible to query it.
>
>> 2. I don't think reserved words are documented somewhere. Can you file a
>> jira for this ?
>
> Ok; will do that today.
>
>> 3. The bad row is printed in the task log.
>>
>> 1. 2011-01-27 11:11:07,046 INFO org.apache.hadoop.fs.FSInputChecker: Found
>> checksum error: b[1024,
>> 1536]=7374796c653d22666f6e742d73697a653a20313270743b223e3c623e266e6273703b2
>> 66e6273703b266e6273703b202a202838302920416d69656e733a3c2f623e3c2f7370616e3e
>> 3c2f7370616e3e5c6e20203c2f703e5c6e20203c703e5c6e202020203c7370616e207374796
>> c653d22666f66742d66616d696c793a2068656c7665746963613b223e3c7370616e20737479
>> 6c653d22666f6e742d73697a653a20313270743b223e3c623e266e6273703b266e6273703b2
>> 66e6273703b266e6273703b266e6273703b266e6273703b266e6273703b266e6273703b266e
>> 6273703b206f203132682c2050697175652d6e6971756520646576616e74206c65205265637
>> 46f7261742e3c2f623e3c2f7370616e3e3c2f7370616e3e5c6e20203c2f703e5c6e20203c70
>> 3e5c6e202020203c7370616e207374796c653d22666f6e742d66616d696c793a2068656c766
>> 5746963613b223e3c7370616e207374796c653d22666f6e742d73697a653a20313270743b22
>> 3e3c623e266e6273703b266e6273703b266e6273703b266e6273703b266e6273703b266e627
>> 3703b266e6273703b266e6273703b266e6273703b206f2031346833302c204d6169736f6e20
>> 6465206c612063756c747572652e3c2f623e3c2f7370616e3e3c2f7370616e3e5c6e20203c2
>> f703e5c6e20203c703e5c6e202020203c7370616e207374796c653d
>
> Is this the actual data ?
>
>> 2. org.apache.hadoop.fs.ChecksumException: Checksum error:
>> /blk_2466764552666222475:of:/user/hive/warehouse/article/article.copy at
>> 23446528
>
> 23446528 is the line number ?
>
> thank you

Optional question (the previous ones are still open): is there a way to
tell Hive to ignore invalid data (if the problem is invalid data)?

--
Laurent "ker2x" Laborde
Sysadmin & DBA at http://www.over-blog.com/
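
On the two log questions quoted above: the long hex string appears to be the raw bytes of the range b[1024, 1536], hex-encoded, so it can be decoded to see what the data looks like. The number after "at" reads as a byte position in the file rather than a line number. Below is a minimal Python sketch of both checks. It assumes a 512-byte checksum chunk, which I believe is the Hadoop default for io.bytes.per.checksum and which matches the 512-byte range in the log. Only the first bytes of the hex string are decoded here.

```python
# Prefix of the hex string quoted from the task log (first 34 bytes only).
hex_prefix = "7374796c653d22666f6e742d73697a653a20313270743b223e3c623e266e6273703b"

# Each pair of hex digits is one byte; this prefix is plain ASCII.
decoded = bytes.fromhex(hex_prefix).decode("ascii")
print(decoded)  # style="font-size: 12pt;"><b>&nbsp;

# Position reported in the ChecksumException message.
error_pos = 23446528

# Assumption: 512 bytes per checksum chunk (not stated in the thread,
# but consistent with the b[1024, 1536] range in the log).
BYTES_PER_CHECKSUM = 512

# Which checksum chunk the position falls in, and how far into it.
chunk_index, offset_in_chunk = divmod(error_pos, BYTES_PER_CHECKSUM)
print(chunk_index, offset_in_chunk)  # 45794 0
```

If that reading is right, the decoded prefix is HTML markup from the article text, and the reported position falls exactly on a checksum chunk boundary, which is what one would expect from a byte offset and not from a line count.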