Re: replace CsvToKeyValueMapper with my implementation

James Taylor Thu, 29 Oct 2015 10:34:09 -0700

I seem to remember you starting down that path, Gabriel - a kind of
pluggable transformation for each row. It wasn't pluggable on the input
format, but that's a nice idea too, Ravi. I'm not sure if this is what Noam
needs or if it's something else.


It would probably be good to discuss this further at the use-case level to
understand the specifics.
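
To make the "pluggable parsing per record" idea a bit more concrete, here is a rough sketch of what such a plug-in point could look like. It uses plain Java only. RecordParser, SimpleCsvParser, KeyValueParser and RecordParsers are all made-up names for illustration, not existing Phoenix classes, and the CSV handling is deliberately naive (no quoting or escaping).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical interface (not part of Phoenix): one implementation per input format.
interface RecordParser {
    /** Turns one raw input record into column values, in table-column order. */
    List<String> parse(String rawRecord);
}

// Naive CSV parsing without quoting or escaping, for illustration only.
class SimpleCsvParser implements RecordParser {
    @Override
    public List<String> parse(String rawRecord) {
        // The -1 limit keeps trailing empty fields instead of dropping them.
        return Arrays.asList(rawRecord.split(",", -1));
    }
}

// Parses "key=value;key=value" records and keeps only the values.
class KeyValueParser implements RecordParser {
    @Override
    public List<String> parse(String rawRecord) {
        List<String> values = new ArrayList<>();
        for (String pair : rawRecord.split(";")) {
            int eq = pair.indexOf('=');
            values.add(eq < 0 ? pair : pair.substring(eq + 1));
        }
        return values;
    }
}

class RecordParsers {
    /** Instantiates a user-supplied parser by class name via its no-arg constructor. */
    static RecordParser load(String className) {
        try {
            Class<? extends RecordParser> cls =
                    Class.forName(className).asSubclass(RecordParser.class);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load parser: " + className, e);
        }
    }
}
```

A bulk load job could then take the parser class name as an option and call RecordParsers.load(...) once per task. Whether the real extension point belongs at the row level or at the input-format level is exactly the open question here.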

On Thu, Oct 29, 2015 at 9:17 AM, Ravi Kiran <[email protected]>
wrote:

> It would be great if we could provide an API and have end users provide an
> implementation of how to parse each record. This way, we can move away
> from bulk loading only CSV and have JSON and other input formats bulk
> loaded into Phoenix tables.
>
> I can take that one up. Would the community like this as a feature?
>
> On Thu, Oct 29, 2015 at 8:10 AM, Gabriel Reid <[email protected]>
> wrote:
>
>> Hi Noam,
>>
>> That specific piece of code in CsvBulkLoadTool that you referred to
>> allows packaging the CsvBulkLoadTool within a different job jar file,
>> but won't allow setting a different mapper class. The actual setting
>> of the mapper class is done further down in the submitJob method,
>> specifically the following piece:
>>
>>    job.setMapperClass(CsvToKeyValueMapper.class);
>>
>> There isn't currently a way to load a custom mapper in the
>> CsvBulkLoadTool, so the only (current) option is to create a fully new
>> custom implementation of the bulk load tool (probably copying or
>> reusing most of the existing tool). However, I can certainly imagine
>> this being a useful feature to have in some situations.
>>
>> Could you log this request in JIRA? It would also be really good to
>> have some more detail on your specific use case. Even better would be a
>> patch that implements it :-)
>>
>> - Gabriel
>>
>>
>> On Thu, Oct 29, 2015 at 3:22 PM, Bulvik, Noam <[email protected]>
>> wrote:
>> > Hi,
>> >
>> >
>> >
>> > We have private logic to be executed when parsing each line before it is
>> > uploaded to Phoenix. I saw the following in the code of CsvBulkLoadTool:
>> >
>> >     // Allow overriding the job jar setting by using a -D system property
>> >     // at startup
>> >     if (job.getJar() == null) {
>> >         job.setJarByClass(CsvToKeyValueMapper.class);
>> >     }
>> >
>> >
>> >
>> > Assuming I have an implementation called MyKeyValueMapper, how can I make
>> > sure it is loaded instead of the standard one?
>> >
>> >
>> >
>> > Also, the CsvToKeyValueMapper class has some private members, such as:
>> >
>> >   - private PhoenixConnection conn;
>> >   - private byte[] tableName;
>> >
>> >
>> >
>> > Could you add a way to access these members, or make them protected, so
>> > that a class extending CsvToKeyValueMapper can use them without
>> > duplicating the members and the code that initializes them?
>> >
>> >
>> >
>> > We are using Phoenix 4.5.2 on CDH.
>> >
>> >
>> >
>> > Thanks,
>> >
>> > Noam
>> >
>> >
>> >
>> > Noam Bulvik
>> >
>> > R&D Manager
>> >
>> >
>> >
>> > TEOCO CORPORATION
>> >
>> > c: +972 54 5507984
>> >
>> > p: +972 3 9269145
>> >
>> > [email protected]
>> >
>> > www.teoco.com
>>
>
>
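
For reference, the two requests in this thread (choosing the mapper class through configuration rather than the hard-coded job.setMapperClass(CsvToKeyValueMapper.class), and exposing state to subclasses) could be sketched roughly as below. This is plain Java with stand-in classes, not the Phoenix or Hadoop API: BaseLineMapper stands in for CsvToKeyValueMapper, java.util.Properties stands in for the job configuration, and the property name example.bulkload.mapper.class is invented for this sketch. In a real job the lookup would presumably go through Hadoop's Configuration instead.

```java
import java.util.Locale;
import java.util.Properties;

// Stand-in for CsvToKeyValueMapper; NOT the real Phoenix class.
class BaseLineMapper {
    // Protected rather than private, so subclasses can reuse it without duplication.
    protected String tableName;

    void setup(String tableName) {
        this.tableName = tableName;
    }

    /** Hook for subclasses: adjust a raw line before it is processed further. */
    protected String preprocess(String line) {
        return line;
    }

    /** Fixed processing flow; only the preprocess step is overridable. */
    final String map(String line) {
        return tableName + ":" + preprocess(line);
    }
}

// Hypothetical user subclass carrying private per-line logic.
class MyLineMapper extends BaseLineMapper {
    @Override
    protected String preprocess(String line) {
        return line.trim().toUpperCase(Locale.ROOT);
    }
}

class MapperSelection {
    // Hypothetical property name, invented for this sketch.
    static final String MAPPER_CLASS_KEY = "example.bulkload.mapper.class";

    /** Returns the configured mapper class, or the default one if none is set. */
    static Class<? extends BaseLineMapper> resolveMapperClass(Properties conf) {
        String name = conf.getProperty(MAPPER_CLASS_KEY);
        if (name == null) {
            return BaseLineMapper.class;
        }
        try {
            return Class.forName(name).asSubclass(BaseLineMapper.class);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown mapper class: " + name, e);
        }
    }

    /** Creates and initializes a mapper of the configured class. */
    static BaseLineMapper newMapper(Properties conf, String tableName) {
        try {
            BaseLineMapper mapper =
                    resolveMapperClass(conf).getDeclaredConstructor().newInstance();
            mapper.setup(tableName);
            return mapper;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate mapper", e);
        }
    }
}
```

With this shape, a user would only set the property to the name of their subclass; the tool itself would not need to be copied. The asSubclass check rejects classes that do not extend the base mapper.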
