Gabriel,

Thanks for your response. My current plan is to implement the bulk load using Scalding via JDBC. I have not played with Pig, but my guess is that my Scalding solution will achieve comparable performance.
I haven't done a performance test yet, but if it turns out that loading via JDBC is too slow, I would need to generate the HFiles. I would be interested in your thoughts on how you'd approach generating HFiles. Would you extend the CSV bulk loader? How would you represent dynamic columns in a CSV? A general solution is further complicated by the fact that a dynamic column may have heterogeneous types.

-Bob

On Thursday, October 16, 2014 12:24 AM, Gabriel Reid <[email protected]> wrote:

Hi Bob,

No, there currently isn't any support for bulk loading dynamic columns. I think that this would (in theory) be as simple as supplying a custom upsert statement to the bulk loader or PhoenixHBaseStorage (if you're using Pig), so it probably wouldn't be too tricky to implement.

If you're interested in having something like this in Phoenix, could you log a ticket for it at https://issues.apache.org/jira/browse/PHOENIX? If you're interested in taking a crack at implementing it as well, feel free (as well as feeling free to ask for advice on how to go about it).

- Gabriel

On Thu, Oct 16, 2014 at 7:58 AM, Bob Dole <[email protected]> wrote:
> Is there any existing support to perform bulk loading with dynamic columns?
>
> Thanks!
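For context on the custom-upsert idea discussed above: Phoenix lets an UPSERT declare dynamic columns inline by naming each one with its type in the column list. A helper that builds such a statement from a per-record set of dynamic columns is one way a custom upsert could be supplied to the bulk loader. The sketch below is only illustrative; the table name, column names, and the `buildUpsert` helper are hypothetical, not part of any existing Phoenix API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class DynamicUpsertBuilder {

    // Build a Phoenix UPSERT that declares dynamic columns inline with
    // their types, e.g.:
    //   UPSERT INTO EVENTS (ID, "CLICKS" BIGINT, "REFERRER" VARCHAR) VALUES (?, ?, ?)
    // dynamicCols maps column name -> Phoenix type name; a LinkedHashMap
    // keeps the declared order aligned with the bind-parameter order.
    static String buildUpsert(String table, String pkColumn,
                              Map<String, String> dynamicCols) {
        StringBuilder cols = new StringBuilder(pkColumn);
        for (Map.Entry<String, String> e : dynamicCols.entrySet()) {
            cols.append(", \"").append(e.getKey()).append("\" ").append(e.getValue());
        }
        // One placeholder for the PK plus one per dynamic column.
        String placeholders = String.join(", ",
                Collections.nCopies(dynamicCols.size() + 1, "?"));
        return "UPSERT INTO " + table + " (" + cols + ") VALUES (" + placeholders + ")";
    }

    public static void main(String[] args) {
        Map<String, String> dyn = new LinkedHashMap<>();
        dyn.put("CLICKS", "BIGINT");
        dyn.put("REFERRER", "VARCHAR");
        System.out.println(buildUpsert("EVENTS", "ID", dyn));
        // UPSERT INTO EVENTS (ID, "CLICKS" BIGINT, "REFERRER" VARCHAR) VALUES (?, ?, ?)
    }
}
```

Because each record may carry a different set of dynamic columns (with heterogeneous types, as noted above), a loader built on this approach would likely need to prepare a statement per distinct column-set rather than reuse a single PreparedStatement for the whole input.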
