Hi Arvind,
Thanks for the explanation.

I am a newbie, so I am not familiar with the terms.
Is the struct implementation a POJO or something else?

My guess is that a struct is a simple POJO. If so, the POJO, serialized to 
bytes, would be passed to a BytesWritable.
Should that work?
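
To make the question concrete, here is a minimal sketch of what "a simple POJO 
represented in bytes" could look like. It uses only java.io. The ValueRecord 
class, its field names (taken from the VALUE_FIELD1..3 example further down the 
thread), and the toBytes/fromBytes helpers are all hypothetical. Whether the 
SerDe layer can then map such bytes onto columns is a separate question that 
this sketch does not answer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical POJO mirroring VALUE_FIELD1..3 from the original question.
// It does not depend on Hadoop; a real custom writable would implement
// org.apache.hadoop.io.Writable instead.
class ValueRecord {
    final long valueField1;
    final String valueField2;
    final long valueField3;

    ValueRecord(long valueField1, String valueField2, long valueField3) {
        this.valueField1 = valueField1;
        this.valueField2 = valueField2;
        this.valueField3 = valueField3;
    }

    // Serialize the fields, in declaration order, into a byte array.
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(valueField1);
            out.writeUTF(valueField2);
            out.writeLong(valueField3);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild a record by reading the fields back in the same order.
    static ValueRecord fromBytes(byte[] bytes) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(bytes))) {
            long first = in.readLong();
            String second = in.readUTF();
            long third = in.readLong();
            return new ValueRecord(first, second, third);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting array could then be wrapped with new BytesWritable(bytes) inside 
the record reader. That step is left out so the sketch runs without Hadoop on 
the classpath.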



-Sagar

On Apr 16, 2010, at 9:58 AM, Arvind Prabhakar wrote:

> Sagar,
> 
> Unfortunately it is more complicated than that. The idea behind the record 
> reader implementation is to actually convert the underlying writable into a 
> type that is understood by the SerDe layer. At this time, the SerDe layer 
> seems to understand BytesWritable and Text types. So - if you could take your 
> custom type and emit a BytesWritable that represents a struct implementation 
> of the same, it would work.
> 
> Another alternative which would be simple to implement would be to do the 
> following:
> 
> 1. Modify your custom writable so that it has a toString() method that 
> generates a parsable representation of the fields. For example you could use 
> the JSON representation in your toString() method.
> 
> 2. Create the external table with inputformat 
> 'org.apache.hadoop.mapred.SequenceFileAsTextInputFormat' and outputformat 
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat', mapping the 
> entire value type to a single string column.
> 
> 3. Use UDFJson (exposed in queries as get_json_object) to extract the 
> individual attributes from the JSON string that is emitted from the select 
> query.
> 
> You can use this output to populate a new table that now has the parsed 
> values separated out in the warehouse.
> 
> Arvind
> 
> 
> On Thu, Apr 15, 2010 at 6:01 PM, Sagar Naik <[email protected]> wrote:
> Hi Arvind,
> 
> You guessed correctly.
> 
> We have custom writables.
> I looked at the TextRecordReader implementation to get an idea of how a 
> RecordReader works.
> 
> It looks like createRow creates an instance and next(...) populates this 
> instance.
> createRow returns an instance of Writable.
> 
> Is the Writable instance the same as the "struct" from your reply?
> 
> How is this Writable instance mapped to column names?
> Is there something in the command-line syntax that binds the Writable 
> instance to column names and values?
> Or will the ObjectInspector do it magically?
> 
> -Sagar
> On Apr 15, 2010, at 12:00 PM, Arvind Prabhakar wrote:
> 
>> Hi Sagar,
>> 
>> Looks like your source file has custom writable types in it. If that is the 
>> case, implementing a SerDe that works with that type may not be that 
>> straightforward, although it is doable. 
>> 
>> An alternative would be to implement a custom RecordReader that converts the 
>> value of your custom writable to a Struct type, which can then be queried 
>> directly.
>> 
>> Arvind
>> 
>> On Thu, Apr 15, 2010 at 1:06 AM, Sagar Naik <[email protected]> wrote:
>> Hi
>> 
>> My data is in the value field of a sequence file.
>> The value field has subfields in it. I am trying to create a table using 
>> these subfields.
>> Example:
>> <KEY> <VALUE>
>> <KEY_FIELD1, KEY_FIELD2> forms the key
>> <VALUE_FIELD1, VALUE_FIELD2, VALUE_FIELD3> forms the value.
>> So I am trying to create a table from VALUE_FIELD*
>> 
>> CREATE EXTERNAL TABLE table_name (VALUE_FIELD1 BIGINT, VALUE_FIELD2 STRING, 
>> VALUE_FIELD3 BIGINT) STORED AS SEQUENCEFILE;
>> 
>> I am planning to write a custom SerDe implementation and a custom 
>> SequenceFileReader.
>> Please let me know if I am on the right track.
>> 
>> 
>> -Sagar
>> 
> 
> 
