Do you think it would be that complex to support? I think we could try to implement it if someone could give us some support (at least the big picture).
On Tue, Jan 16, 2018 at 10:02 AM, Fabian Hueske <fhue...@gmail.com> wrote:

> No, I'm not aware of anybody working on extending the Hadoop compatibility
> support.
> I'll also have no time to work on this any time soon :-(
>
> 2018-01-13 1:34 GMT+01:00 Flavio Pompermaier <pomperma...@okkam.it>:
>
>> Any progress on this, Fabian? HBase bulk loading is a common task for us,
>> and it's very annoying and uncomfortable to run a separate YARN job to
>> accomplish it...
>>
>> On 10 Apr 2015 12:26, "Flavio Pompermaier" <pomperma...@okkam.it> wrote:
>>
>> Great! That will be awesome.
>> Thank you, Fabian
>>
>> On Fri, Apr 10, 2015 at 12:14 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>>
>>> Hmm, that's a tricky question ;-) I would need to have a closer look.
>>> But getting custom comparators for sorting and grouping into the Combiner
>>> is not trivial because it touches API, Optimizer, and Runtime code.
>>> However, I did that before for the Reducer, and with the recent addition
>>> of groupCombine the Reducer changes might simply be applied to combine.
>>>
>>> I'll be gone next week, but if you want, we can have a closer look at
>>> the problem after that.
>>>
>>> 2015-04-10 12:07 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>
>>>> I think I could also take care of it if somebody can help me and guide
>>>> me a little bit.
>>>> How long do you think it would take to complete such a task?
>>>>
>>>> On Fri, Apr 10, 2015 at 12:02 PM, Fabian Hueske <fhue...@gmail.com>
>>>> wrote:
>>>>
>>>>> We had an effort to execute any Hadoop MR program by simply specifying
>>>>> the JobConf and executing it (even embedded in regular Flink programs).
>>>>> We got quite far but did not finish (counters and custom grouping /
>>>>> sorting functions for Combiners are missing, if I remember correctly).
>>>>> I don't think anybody is working on that right now, but it would
>>>>> definitely be a cool feature.
>>>>>
>>>>> 2015-04-10 11:55 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>>
>>>>>> Hi guys,
>>>>>>
>>>>>> I have a question about Hadoop compatibility.
>>>>>> In https://flink.apache.org/news/2014/11/18/hadoop-compatibility.html
>>>>>> you say that existing mapreduce programs can be reused.
>>>>>> Would it also be possible to run complex mapreduce programs like
>>>>>> HBase BulkImport, which use, for example, a custom partitioner
>>>>>> (org.apache.hadoop.mapreduce.Partitioner)?
>>>>>>
>>>>>> The bulk-import examples call
>>>>>> HFileOutputFormat2.configureIncrementalLoadMap,
>>>>>> which sets a series of job parameters (like partitioner, mapper,
>>>>>> reducers, etc.) -> http://pastebin.com/8VXjYAEf.
>>>>>> The full code can be seen at
>>>>>> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java.
>>>>>>
>>>>>> Do you think there's any chance to make it run in Flink?
>>>>>>
>>>>>> Best,
>>>>>> Flavio

--
Flavio Pompermaier
Development Department

OKKAM S.r.l.
Tel. +(39) 0461 041809
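For context on what the custom partitioner in the bulk-load job actually does: HFileOutputFormat2 configures a total-order partitioner so that each reducer receives exactly the row keys of one HBase region, based on the table's sorted region start keys. Stripped of the Hadoop interfaces, the routing logic is essentially a binary search over those boundaries. Below is a minimal, self-contained sketch of that idea; the class and method names are illustrative, not HBase's API:

```java
import java.util.Arrays;

// Illustrative stand-in for the routing a total-order partitioner performs
// during an HBase bulk load: each row key is assigned to the partition
// (region index) whose key range contains it. Hypothetical names, not HBase API.
public class RegionBoundaryPartitioner {
    private final String[] splitPoints; // sorted start keys of regions 1..n-1

    public RegionBoundaryPartitioner(String[] splitPoints) {
        this.splitPoints = splitPoints.clone();
        Arrays.sort(this.splitPoints);
    }

    /** Returns the partition (region index) the given row key belongs to. */
    public int getPartition(String rowKey) {
        int idx = Arrays.binarySearch(splitPoints, rowKey);
        // Exact match on a split point -> that split starts the next region;
        // otherwise the insertion point is the index of the containing region.
        return idx >= 0 ? idx + 1 : -(idx + 1);
    }

    public static void main(String[] args) {
        // Three split points -> four regions:
        // (-inf,"d"), ["d","m"), ["m","t"), ["t",+inf)
        RegionBoundaryPartitioner p =
                new RegionBoundaryPartitioner(new String[] {"d", "m", "t"});
        System.out.println(p.getPartition("apple")); // region 0
        System.out.println(p.getPartition("d"));     // region 1
        System.out.println(p.getPartition("zebra")); // region 3
    }
}
```

Whichever engine runs the job, this is the piece that has to be pluggable: Hadoop MR takes it via job.setPartitionerClass(...), so a Flink-based replacement would need an equivalent hook to shuffle by region boundary before writing HFiles.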