[ 
https://issues.apache.org/jira/browse/FLINK-838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084352#comment-14084352
 ] 

Artem Tsikiridis commented on FLINK-838:
----------------------------------------

Hello,

The approach suggested in the above comment seems nice and should solve the 
problem. It is also not hard to implement on the driver's side. However, I am 
not sure how best to add this API hook to handle a {{Tuple3}} in a 
grouping-partitioning. What do you think?

Here is a report of what is going on:

1) I can't seem to get the custom classloader right so that it replaces any 
{{JobClient}} with a {{FlinkJobClient}}. The problem is that the classloader 
that loads the {{JobClient}} is not the user's classloader for Flink but its 
parent. I can't stop this delegation and handle my case. Do you have any 
advice? I have spent several days on this one (I should have asked earlier) 
and I am still a bit stuck.

2) I have implemented all of the latest comments for the PR 
(https://github.com/apache/incubator-flink/pull/37#discussion_r15390287). The 
only thing I am unsure of is what to do when no number of slots has been set 
(can we really assume this is an IDE run?). I added 3 more test cases: a test 
job where the reducer has different types, a map-only job without sorting (no 
reducers or combiners launched), and a test for {{MultipleInputs}} (it is 
supported with our current code, as the driver only deals with its product, 
which is a {{DelegatingInputFormat}}, still an {{InputFormat}}). I've also 
refactored {{FlinkHadoopJobClient}} so that we don't have to repeat code in a 
{{HadoopJobOperation}} (I made a prototype). You can see it as soon as we 
decide what should be done with the slot number in the case of the IDE run 
(today?).

3) I'm trying to wrap the other features of the {{JobClient}} and {{JobConf}} 
and support as much as possible. It's finished. I must show you results in the 
next couple of days, as we have limited time and you'll probably have comments.

4) I finished support for sorting (custom {{Comparators}}) a while ago.
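
To make the classloader question in 1) concrete, here is a minimal sketch of a 
child-first classloader, written only under assumptions. The class name 
{{ChildFirstClassLoader}} and the prefix list are hypothetical and are not part 
of Flink or Hadoop. The idea is to look on the loader's own URLs first for the 
selected class names, and to delegate to the parent for everything else.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Hypothetical child-first classloader (illustration only, not Flink code).
 * Classes whose names start with one of the given prefixes are looked up on
 * this loader's own URLs before the parent is asked.
 */
class ChildFirstClassLoader extends URLClassLoader {

    private final String[] childFirstPrefixes;

    ChildFirstClassLoader(URL[] urls, ClassLoader parent, String... childFirstPrefixes) {
        super(urls, parent);
        this.childFirstPrefixes = childFirstPrefixes.clone();
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse the class if this loader has already defined it.
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null && isChildFirst(name)) {
                try {
                    // Look on our own URLs first, skipping the parent.
                    loaded = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not on our URLs: fall back to normal parent-first delegation.
                }
            }
            if (loaded == null) {
                return super.loadClass(name, resolve);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    private boolean isChildFirst(String name) {
        for (String prefix : childFirstPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

It would be constructed with the URL of a jar that contains the replacement 
class and with a prefix such as {{org.apache.hadoop.mapred.JobClient}}. Note 
that this only affects classes resolved through this loader: code that was 
already loaded by the parent still links against the parent's {{JobClient}}, 
so the user's job classes would have to be loaded through this loader as well.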

There are 2 weeks left, we should make them count! :)

> GSoC Summer Project: Implement full Hadoop Compatibility Layer for 
> Stratosphere
> -------------------------------------------------------------------------------
>
>                 Key: FLINK-838
>                 URL: https://issues.apache.org/jira/browse/FLINK-838
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: GitHub Import
>              Labels: github-import
>             Fix For: pre-apache
>
>
> This is a meta issue for tracking @atsikiridis's progress with implementing a 
> full Hadoop Compatibility Layer for Stratosphere.
> Some documentation can be found in the Wiki: 
> https://github.com/stratosphere/stratosphere/wiki/%5BGSoC-14%5D-A-Hadoop-abstraction-layer-for-Stratosphere-(Project-Map-and-Notes)
> As well as the project proposal: 
> https://github.com/stratosphere/stratosphere/wiki/GSoC-2014-Project-Proposal-Draft-by-Artem-Tsikiridis
> Most importantly, there is the following **schedule**:
> *19 May - 27 June (Midterm)*
> 1) Work on the Hadoop tasks, their Context and the mapping of Hadoop's 
> Configuration to the one of Stratosphere. By successfully bridging the Hadoop 
> tasks with Stratosphere, we already cover the most basic Hadoop Jobs. This 
> can be determined by running some popular Hadoop examples on Stratosphere 
> (e.g. WordCount, k-means, join) (4 - 5 weeks)
> 2) Understand how the running of these jobs works (e.g. command line 
> interface) for the wrapper. Implement how the user will run them. (1 - 2 
> weeks).
> *27 June - 11 August*
> 1) Continue wrapping more "advanced" Hadoop Interfaces (Comparators, 
> Partitioners, Distributed Cache etc.) There are quite a few interfaces and it 
> will be a challenge to support all of them. (5 full weeks)
> 2) Profiling of the application and optimizations (if applicable)
> *11 August - 18 August*
> Write documentation on code, write a README with care and add more 
> unit-tests. (1 week)
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/838
> Created by: [rmetzger|https://github.com/rmetzger]
> Labels: core, enhancement, parent-for-major-feature, 
> Milestone: Release 0.7 (unplanned)
> Created at: Tue May 20 10:11:34 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.2#6252)
