I seem to remember you starting down that path, Gabriel - a kind of pluggable transformation for each row. It wasn't pluggable on the input format, but that's a nice idea too, Ravi. I'm not sure if this is what Noam needs or if it's something else.
It would probably be good to discuss this a bit more at the use case level to understand the specifics.

On Thu, Oct 29, 2015 at 9:17 AM, Ravi Kiran <[email protected]> wrote:

> It would be great if we can provide an api and have end users provided
> implementation on how to parse each record. This way, we can move away
> with only bulk loading csv and have json and other formats of input bulk
> loaded onto phoenix tables.
>
> I can take that one up. Would it be something the community like as a
> feature?
>
> On Thu, Oct 29, 2015 at 8:10 AM, Gabriel Reid <[email protected]>
> wrote:
>
>> Hi Noam,
>>
>> That specific piece of code in CsvBulkLoadTool that you referred to
>> allows packaging the CsvBulkLoadTool within a different job jar file,
>> but won't allow setting a different mapper class. The actual setting
>> of the mapper class is done further down in the submitJob method,
>> specifically the following piece:
>>
>>     job.setMapperClass(CsvToKeyValueMapper.class);
>>
>> There isn't currently a way to load a custom mapper in the
>> CsvBulkLoadTool, so the only (current) option is to create a fully new
>> custom implementation of the bulk load tool (probably copying or
>> reusing most of the existing tool). However, I can certainly imagine
>> this being a useful feature to have in some situations.
>>
>> Could you log this request in jira? It would also be really good to
>> have some more detail on your specific use case. And even better is a
>> patch that implements it :-)
>>
>> - Gabriel
>>
>>
>> On Thu, Oct 29, 2015 at 3:22 PM, Bulvik, Noam <[email protected]>
>> wrote:
>> > Hi,
>> >
>> > We have private logic to be executed when parsing each line before it is
>> > uploaded to phoenix. I saw the following in the code of the
>> > CsvBulkLoadTool
>> >
>> > // Allow overriding the job jar setting by using a -D system property at startup
>> > if (job.getJar() == null)
>> > {
>> >     job.setJarByClass(CsvToKeyValueMapper.class);
>> > }
>> >
>> > Assuming I have the implementation for MyKeyValueMapper how can I make sure
>> > it will be loaded instead of standard one?
>> >
>> > Also in CsvToKeyValueMapper class there are some private members like
>> >
>> > · private PhoenixConnection conn;
>> > · private byte[] tableName;
>> >
>> > can you add option to access these member or make them protected so we will
>> > be able to use them in the class we create that extends CsvToKeyValueMapper
>> > and not to duplicate them and the code that init them
>> >
>> > we are using phoenix 4.5.2 over CDH
>> >
>> > thanks
>> >
>> > Noam
>> >
>> > Noam Bulvik
>> > R&D Manager
>> > TEOCO CORPORATION
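
As a rough illustration of the pluggable per-record parsing idea discussed in this thread, here is a minimal Java sketch. It is not Phoenix code and does not use the Hadoop or Phoenix APIs: the RecordParser interface, the two parser classes, ParserLoader, and the property name example.bulkload.record.parser.class are all hypothetical names made up for the example. It only shows the general pattern of resolving a user-supplied parser class from configuration and falling back to a CSV default when nothing is configured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Hypothetical plug-in point: turns one raw input record into column values. */
interface RecordParser {
    List<String> parse(String rawRecord);
}

/** Default in this sketch: naive comma splitting (no quoting or escaping). */
class SimpleCsvParser implements RecordParser {
    @Override
    public List<String> parse(String rawRecord) {
        // The -1 limit keeps trailing empty fields instead of dropping them.
        return Arrays.asList(rawRecord.split(",", -1));
    }
}

/** Example user-supplied parser: pipe-delimited input, each field trimmed. */
class PipeDelimitedParser implements RecordParser {
    @Override
    public List<String> parse(String rawRecord) {
        List<String> values = new ArrayList<>();
        for (String field : rawRecord.split("\\|", -1)) {
            values.add(field.trim());
        }
        return values;
    }
}

/** Resolves the parser class from configuration, falling back to the CSV default. */
final class ParserLoader {
    // Hypothetical property name, not an existing Phoenix setting.
    static final String PARSER_CLASS_KEY = "example.bulkload.record.parser.class";

    private ParserLoader() {}

    static RecordParser load(Properties conf) throws ReflectiveOperationException {
        String className =
                conf.getProperty(PARSER_CLASS_KEY, SimpleCsvParser.class.getName());
        // asSubclass rejects classes that do not implement RecordParser.
        Class<? extends RecordParser> parserClass =
                Class.forName(className).asSubclass(RecordParser.class);
        return parserClass.getDeclaredConstructor().newInstance();
    }
}

class PluggableParserDemo {
    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        // Nothing configured: the default CSV parser is used.
        System.out.println(ParserLoader.load(conf).parse("1,foo,bar"));

        // A user plugs in their own parser by class name.
        conf.setProperty(ParserLoader.PARSER_CLASS_KEY,
                PipeDelimitedParser.class.getName());
        System.out.println(ParserLoader.load(conf).parse("1 | foo | bar"));
    }
}
```

In a real bulk load job the equivalent lookup would presumably happen where the mapper is set up, so that a custom mapper or parser could be supplied without copying the whole tool. How that would fit into CsvBulkLoadTool is exactly the open question in this thread.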
