Hi Arvind,

Thanks for the explanation. I am a newbie, so I am not familiar with the terms. Is the struct implementation a POJO or something else?
My guess is that a struct is a simple POJO. If so, a simple POJO serialized to bytes would be passed to BytesWritable, and it should work?

-Sagar

On Apr 16, 2010, at 9:58 AM, Arvind Prabhakar wrote:

> Sagar,
>
> Unfortunately it is more complicated than that. The idea behind the record
> reader implementation is to actually convert the underlying writable into a
> type that is understood by the SerDe layer. At this time, the SerDe layer
> seems to understand BytesWritable and Text types. So if you could take your
> custom type and emit a BytesWritable that represents a struct implementation
> of the same, it would work.
>
> Another alternative which would be simple to implement would be to do the
> following:
>
> 1. Modify your custom writable so that it has a toString() method that
> generates a parsable representation of the fields. For example, you could use
> the JSON representation in your toString() method.
>
> 2. Create the external table with inputformat
> 'org.apache.hadoop.mapred.SequenceFileAsTextInputFormat' and outputformat
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat', mapping the
> entire value type to a single string column.
>
> 3. Use the UDFJson to extract the individual attributes from the JSON string
> that is emitted from the select query.
>
> You can use this output to populate a new table that now has the parsed
> values separated out in the warehouse.
>
> Arvind
>
>
> On Thu, Apr 15, 2010 at 6:01 PM, Sagar Naik <[email protected]> wrote:
> Hi Arvind,
>
> You guessed correctly.
>
> We have custom writables.
> I looked at the TextRecordReader implementation to get an idea of how a
> RecordReader works.
>
> It looks like createRow creates an instance and next(...) populates this
> instance.
> createRow returns an instance of Writable.
>
> Is this Writable instance the same as the "struct" from your reply?
>
> How is this Writable instance mapped to column names?
> Is there something in the command-line syntax which binds the Writable
> instance to column names and values?
> Or will the ObjectInspector do it magically?
>
> -Sagar
> On Apr 15, 2010, at 12:00 PM, Arvind Prabhakar wrote:
>
>> Hi Sagar,
>>
>> Looks like your source file has custom writable types in it. If that is the
>> case, implementing a SerDe that works with that type may not be that
>> straightforward, although doable.
>>
>> An alternative would be to implement a custom RecordReader that converts the
>> value of your custom writable to Struct type which can then be queried
>> directly.
>>
>> Arvind
>>
>> On Thu, Apr 15, 2010 at 1:06 AM, Sagar Naik <[email protected]> wrote:
>> Hi
>>
>> My data is in the value field of a sequence file.
>> The value field has subfields in it. I am trying to create a table using
>> these subfields.
>> Example:
>> <KEY> <VALUE>
>> <KEY_FIELD1, KEY_FIELD2> forms the key
>> <VALUE_FIELD1, VALUE_FIELD2, VALUE_FIELD3>.
>> So I am trying to create a table from VALUE_FIELD*
>>
>> CREATE EXTERNAL TABLE table_name (VALUE_FIELD1 BIGINT, VALUE_FIELD2
>> STRING, VALUE_FIELD3 BIGINT) STORED AS SEQUENCEFILE;
>>
>> I am planning to write a custom SerDe implementation and a custom
>> SequenceFileReader.
>> Please let me know if I am on the right track.
>>
>>
>> -Sagar
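
For reference, step 1 of Arvind's alternative (a toString() that emits a parsable JSON representation) might look roughly like the sketch below. The class name ValueRecord, its three fields, and the escape() helper are all hypothetical, chosen to mirror the VALUE_FIELD1..3 example from the first message. In a real job the class would implement org.apache.hadoop.io.Writable; that is left out here so the sketch compiles without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical value type with the three subfields from the original question.
// Assumption: the real class would declare "implements org.apache.hadoop.io.Writable";
// write() and readFields() below have the same signatures as that interface.
class ValueRecord {
    private long valueField1;
    private String valueField2 = "";
    private long valueField3;

    ValueRecord() {}

    ValueRecord(long valueField1, String valueField2, long valueField3) {
        this.valueField1 = valueField1;
        this.valueField2 = valueField2;
        this.valueField3 = valueField3;
    }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeLong(valueField1);
        out.writeUTF(valueField2);
        out.writeLong(valueField3);
    }

    // Read the fields back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        valueField1 = in.readLong();
        valueField2 = in.readUTF();
        valueField3 = in.readLong();
    }

    // Minimal JSON string escaping so the output stays parsable.
    private static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become 4-digit hex escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // One flat JSON object per record; this is the text a single string column would hold.
    @Override
    public String toString() {
        return "{\"VALUE_FIELD1\":" + valueField1
            + ",\"VALUE_FIELD2\":\"" + escape(valueField2) + "\""
            + ",\"VALUE_FIELD3\":" + valueField3 + "}";
    }

    public static void main(String[] args) throws IOException {
        ValueRecord original = new ValueRecord(42L, "hello \"hive\"", 7L);

        // Round-trip through write()/readFields(), as a sequence file reader would.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));
        ValueRecord copy = new ValueRecord();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));

        // Prints: {"VALUE_FIELD1":42,"VALUE_FIELD2":"hello \"hive\"","VALUE_FIELD3":7}
        System.out.println(copy);
    }
}
```

With SequenceFileAsTextInputFormat the value reaches Hive as the text produced by toString(), so each row of the single string column would hold one such JSON object. The individual attributes could then presumably be pulled out with get_json_object (the function backed by UDFJson), using paths such as '$.VALUE_FIELD1'.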
