I think that it's not standard. Usually you need to specify the table name, but everything in *-site.xml files should be loaded automatically at runtime, as in all other frameworks like MapReduce or Spark. Don't you think? Why does Flink behave differently within the configure method?

On Nov 14, 2014 9:26 PM, "Fabian Hueske" <[email protected]> wrote:

What exactly is required to configure the TableInputFormat?
Would it be easier and more flexible to just set the hostname of the HBase master, the table name, etc., directly as strings in the InputFormat?

2014-11-14 15:34 GMT+01:00 Flavio Pompermaier <[email protected]>:

Both from the shell with the run command and from the web client.

On Nov 14, 2014 2:32 PM, "Fabian Hueske" <[email protected]> wrote:

In this case, the initialization happens when the InputFormat is instantiated at the submission client, and the table info is serialized as part of the InputFormat and shipped out to all TaskManagers for execution. However, if the initialization is done within configure, it happens on each TaskManager when initializing the InputFormat. These are two separate JVMs in a distributed setting with different classpaths.

How do you submit your job for execution?

2014-11-14 13:58 GMT+01:00 Flavio Pompermaier <[email protected]>:

The strange thing is that everything works if I create the HTable outside configure()..

On Nov 14, 2014 10:32 AM, "Stephan Ewen" <[email protected]> wrote:

I think that this is a case where the wrong classloader is used:

If the HBase classes are part of the flink lib directory, they are loaded with the system class loader. When they look for anything in the classpath, they will do so with the system classloader.

Your configuration is in the user code jar that you submit, so it is only available through the user-code classloader.

Any way you can load the configuration yourself and give that configuration to HBase?

Stephan

Am 13.11.2014 22:06 schrieb "Flavio Pompermaier" <[email protected]>:

The only config files available are within the submitted jar. Things work in Eclipse using the local environment, while they fail when deploying to the cluster.

On Nov 13, 2014 10:01 PM, <[email protected]> wrote:

Does the HBase jar in the lib folder contain a config that could be used instead of the config in the job jar file? Or is simply no config at all available when the configure method is called?

--
Fabian Hueske
Phone: +49 170 5549438
Email: [email protected]
Web: http://www.user.tu-berlin.de/fabian.hueske

From: Flavio Pompermaier
Sent: Thursday, 13. November, 2014 21:43
To: [email protected]

The hbase jar is in the lib directory on each node, while the config files are within the jar file I submit from the web client.

On Nov 13, 2014 9:37 PM, <[email protected]> wrote:

Have you added the hbase.jar file with your HBase config to the ./lib folders of your Flink setup (JobManager, TaskManager), or is it bundled with your job.jar file?

From: Flavio Pompermaier
Sent: Thursday, 13. November, 2014 18:36
To: [email protected]

Any help with this? :(

On Thu, Nov 13, 2014 at 2:06 PM, Flavio Pompermaier <[email protected]> wrote:

We definitely discovered that instantiating HTable and Scan in the configure() method of TableInputFormat causes problems in a distributed environment! If you look at my implementation at
https://github.com/fpompermaier/incubator-flink/blob/master/flink-addons/flink-hbase/src/main/java/org/apache/flink/addons/hbase/TableInputFormat.java
you can see that Scan and HTable were made transient and recreated within configure, but this causes HBaseConfiguration.create() to fail searching for classpath files... could you help us understand why?

On Wed, Nov 12, 2014 at 8:10 PM, Flavio Pompermaier <[email protected]> wrote:

Usually, when I run a mapreduce job on either Spark or Hadoop, I just put the *-site.xml files into the war I submit to the cluster and that's it. I think the problem appeared when I made the HTable a private transient field and the table instantiation was moved into the configure method. Could that be a valid reason? We still have to do a deeper debug, but I'm trying to figure out where to investigate..

On Nov 12, 2014 8:03 PM, "Robert Metzger" <[email protected]> wrote:

Hi,
Maybe it's an issue with the classpath? As far as I know, Hadoop reads the configuration files from the classpath. Maybe the hbase-site.xml file is not accessible through the classpath when running on the cluster?

On Wed, Nov 12, 2014 at 7:40 PM, Flavio Pompermaier <[email protected]> wrote:

Today we tried to execute a job on the cluster instead of on the local executor, and we faced the problem that the hbase-site.xml was basically ignored. Is there a reason why the TableInputFormat works correctly in the local environment while it doesn't on a cluster?

On Nov 10, 2014 10:56 AM, "Fabian Hueske" <[email protected]> wrote:

I don't think we need to bundle the HBase input and output format in a single PR. So, I think we can proceed with the IF only and target the OF later.
However, the fix for Kryo should be in the master before merging the PR. Till is currently working on that and said he expects this to be done by the end of the week.

Cheers, Fabian

2014-11-07 12:49 GMT+01:00 Flavio Pompermaier <[email protected]>:

I also fixed the profile for Cloudera CDH5.1.3. You can build it with the command:

mvn clean install -Dmaven.test.skip=true -Dhadoop.profile=2 -Pvendor-repos,cdh5.1.3

However, it would be good to generate the specific jar when releasing.. (e.g. flink-addons:flink-hbase:0.8.0-hadoop2-cdh5.1.3-incubating)

Best,
Flavio

On Fri, Nov 7, 2014 at 12:44 PM, Flavio Pompermaier <[email protected]> wrote:

I've just updated the code on my fork (synced with the current master and applied improvements coming from comments on the related PR).
I still have to understand how to write results back to an HBase Sink/OutputFormat...
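[Editor's note: Stephan's classloader explanation near the top of this thread can be reproduced in isolation with plain Java. The sketch below is not Flink code; the temp directory stands in for the user-code jar, and the file name hbase-site.xml is used only for illustration. A resource that exists only in the user-code jar is invisible to a parent classloader, which mirrors why HBaseConfiguration.create(), running with the system classloader, cannot find a config shipped in the job jar.]

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class ClassloaderDemo {
    public static void main(String[] args) throws Exception {
        // Simulate the user-code jar: a directory containing hbase-site.xml.
        Path userJarDir = Files.createTempDirectory("user-jar");
        Files.writeString(userJarDir.resolve("hbase-site.xml"),
                "<configuration></configuration>");

        // The parent loader (here: the JDK platform loader) cannot see it...
        ClassLoader parent = ClassLoader.getPlatformClassLoader();
        System.out.println(parent.getResource("hbase-site.xml")); // null

        // ...but a child loader over the "user jar" can. Classes loaded by the
        // parent (like HBase in flink/lib) look up resources through the parent
        // and therefore miss configs that only exist in the submitted jar.
        try (URLClassLoader userCodeLoader = new URLClassLoader(
                new URL[]{userJarDir.toUri().toURL()}, parent)) {
            URL found = userCodeLoader.getResource("hbase-site.xml");
            System.out.println(found != null); // true
        }
    }
}
```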
On Mon, Nov 3, 2014 at 12:05 PM, Flavio Pompermaier <[email protected]> wrote:

Thanks for the detailed answer. So if I run a job from my machine I'll have to download all the scanned data in a table.. right?

Still regarding the GenericTableOutputFormat, it is not clear to me how to proceed.. I saw in the hadoop compatibility addon that it is possible to have such compatibility using the HBaseUtils class, so the open method should become something like:

@Override
public void open(int taskNumber, int numTasks) throws IOException {
	if (Integer.toString(taskNumber + 1).length() > 6) {
		throw new IOException("Task id too large.");
	}
	TaskAttemptID taskAttemptID = TaskAttemptID.forName("attempt__0000_r_"
		+ String.format("%" + (6 - Integer.toString(taskNumber + 1).length()) + "s", " ").replace(" ", "0")
		+ Integer.toString(taskNumber + 1)
		+ "_0");
	this.configuration.set("mapred.task.id", taskAttemptID.toString());
	this.configuration.setInt("mapred.task.partition", taskNumber + 1);
	// for hadoop 2.2
	this.configuration.set("mapreduce.task.attempt.id", taskAttemptID.toString());
	this.configuration.setInt("mapreduce.task.partition", taskNumber + 1);
	try {
		this.context = HadoopUtils.instantiateTaskAttemptContext(this.configuration, taskAttemptID);
	} catch (Exception e) {
		throw new RuntimeException(e);
	}
	final HFileOutputFormat2 outFormat = new HFileOutputFormat2();
	try {
		this.writer = outFormat.getRecordWriter(this.context);
	} catch (InterruptedException iex) {
		throw new IOException("Opening the writer was interrupted.", iex);
	}
}

But I'm not sure about how to pass the JobConf to the class, whether to merge config files, where HFileOutputFormat2 writes the data, and how to implement the public void writeRecord(Record record) API.
Could I have a little chat off the mailing list with the implementor of this extension?

On Mon, Nov 3, 2014 at 11:51 AM, Fabian Hueske <[email protected]> wrote:

Hi Flavio,

let me try to answer your last question on the user's list (to the best of my HBase knowledge):

"I just wanted to know if and how region splitting is handled. Can you explain to me in detail how Flink and HBase work together? What is not fully clear to me is when computation is done by the region servers and when data starts to flow to a Flink worker (which in my test job is only my PC), and how to read the important logged info to understand if my job is performing well."

HBase partitions its tables into so-called "regions" of keys and stores the regions distributed in the cluster using HDFS. I think an HBase region can be thought of as an HDFS block. To make reading an HBase table efficient, region reads should be done locally, i.e., an InputFormat should primarily read regions that are stored on the same machine the IF is running on. Flink's InputSplits partition the HBase input by regions and add information about the storage location of each region. During execution, input splits are assigned to InputFormats that can do local reads.

Best, Fabian

2014-11-03 11:13 GMT+01:00 Stephan Ewen <[email protected]>:

Hi!

The way of passing parameters through the configuration is very old (the original HBase format dates back to that time). I would simply make the HBase format take those parameters through the constructor.

Greetings,
Stephan

On Mon, Nov 3, 2014 at 10:59 AM, Flavio Pompermaier <[email protected]> wrote:

The problem is that I also removed the GenericTableOutputFormat because there is an incompatibility between hadoop1 and hadoop2 for the classes TaskAttemptContext and TaskAttemptContextImpl.. Then it would be nice if the user didn't have to worry about passing the pact.hbase.jtkey and pact.job.id parameters.. I think it is probably a good idea to remove hadoop1 compatibility, keep the HBase addon enabled only for hadoop2 (as before), and decide how to manage those 2 parameters..

On Mon, Nov 3, 2014 at 10:19 AM, Stephan Ewen <[email protected]> wrote:

It is fine to remove it, in my opinion.
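[Editor's note: Stephan's suggestion above — taking the connection parameters through the InputFormat's constructor instead of a Configuration — works in a distributed setting because plain fields are serialized with the InputFormat at the submission client and shipped to the TaskManagers. A minimal sketch follows; the class and field names are made up for illustration, and a real Flink InputFormat implements much more than Serializable.]

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical input format: connection parameters are plain serializable
// fields set at the submission client, so they survive shipping to TaskManagers.
class MyTableInputFormat implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String hbaseMasterHost;
    private final String tableName;

    MyTableInputFormat(String hbaseMasterHost, String tableName) {
        this.hbaseMasterHost = hbaseMasterHost;
        this.tableName = tableName;
    }

    // Called on each TaskManager; the fields are already populated here,
    // so no classpath lookup of hbase-site.xml is needed.
    String describe() {
        return "connecting to " + hbaseMasterHost + " / " + tableName;
    }
}

public class ShipDemo {
    public static void main(String[] args) throws Exception {
        MyTableInputFormat atClient = new MyTableInputFormat("master-host", "my-table");

        // Simulate client -> TaskManager shipping with a serialization round trip.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        new ObjectOutputStream(bytes).writeObject(atClient);
        MyTableInputFormat atTaskManager = (MyTableInputFormat) new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray())).readObject();

        System.out.println(atTaskManager.describe());
    }
}
```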
On Mon, Nov 3, 2014 at 10:11 AM, Flavio Pompermaier <[email protected]> wrote:

That is one class I removed because it was using the deprecated GenericDataSink API.. I can restore it, but then it would be a good idea to remove those warnings (also because, from what I understood, the Record APIs are going to be removed).

On Mon, Nov 3, 2014 at 9:51 AM, Fabian Hueske <[email protected]> wrote:

I'm not familiar with the HBase connector code, but are you maybe looking for the GenericTableOutputFormat?

2014-11-03 9:44 GMT+01:00 Flavio Pompermaier <[email protected]>:

I was trying to modify the example setting hbaseDs.output(new HBaseOutputFormat()); but I can't see any HBaseOutputFormat class.. maybe we shall use another class?

On Mon, Nov 3, 2014 at 9:39 AM, Flavio Pompermaier <[email protected]> wrote:

Maybe that's something I could add to the HBase example and that could be better documented in the Wiki.

Since we're talking about the wiki.. I was looking at the Java API (http://flink.incubator.apache.org/docs/0.6-incubating/java_api_guide.html) and the link to the KMeans example is not working (where it says "For a complete example program, have a look at KMeans Algorithm").

Best,
Flavio

On Mon, Nov 3, 2014 at 9:12 AM, Flavio Pompermaier <[email protected]> wrote:

Ah ok, perfect! That was the reason why I removed it :)

On Mon, Nov 3, 2014 at 9:10 AM, Stephan Ewen <[email protected]> wrote:

You do not really need a HBase data sink. You can call "DataSet.output(new HBaseOutputFormat())".

Stephan

Am 02.11.2014 23:05 schrieb "Flavio Pompermaier" <[email protected]>:

Just one last thing.. I removed the HbaseDataSink because I think it was using the old APIs.. can someone help me in updating that class?
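[Editor's note: the open() method quoted earlier in this thread builds a zero-padded attempt ID with a computed "%Ns" format plus a space-to-zero replace. The same string can be produced more simply with "%06d", which also avoids the invalid zero-width format that the computed version would produce for a six-digit task number. The helper below is hypothetical and only mirrors the padding logic, not the Hadoop TaskAttemptID class.]

```java
public class AttemptIdDemo {
    // Hypothetical helper mirroring the padding logic from the open() sketch:
    // build "attempt__0000_r_NNNNNN_0" with the task number zero-padded to six digits.
    static String attemptId(int taskNumber) {
        if (taskNumber + 1 > 999999) {
            throw new IllegalArgumentException("Task id too large.");
        }
        // %06d zero-pads directly, and still works when the number already has
        // six digits (where a computed "%0s" width would throw at runtime).
        return String.format("attempt__0000_r_%06d_0", taskNumber + 1);
    }

    public static void main(String[] args) {
        System.out.println(attemptId(0));      // attempt__0000_r_000001_0
        System.out.println(attemptId(41));     // attempt__0000_r_000042_0
        System.out.println(attemptId(999998)); // attempt__0000_r_999999_0
    }
}
```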
On Sun, Nov 2, 2014 at 10:55 AM, Flavio Pompermaier <[email protected]> wrote:

Indeed this time the build has been successful :)

On Sun, Nov 2, 2014 at 10:29 AM, Fabian Hueske <[email protected]> wrote:

You can also set up Travis to build your own Github repositories by linking it to your Github account. That way Travis can build all your branches (and you can also trigger rebuilds if something fails). I'm not sure if we can manually retrigger builds on the Apache repository.

Support for Hadoop 1 and 2 is indeed a very good addition :-)

For the discussion about the PR itself, I would need a bit more time to become more familiar with HBase. I also do not have an HBase setup available here. Maybe somebody else in the community who was involved with a previous version of the HBase connector could comment on your question.

Best, Fabian

2014-11-02 9:57 GMT+01:00 Flavio Pompermaier <[email protected]>:

As suggested by Fabian, I moved the discussion to this mailing list.

I think that what is still to be discussed is how to retrigger the build on Travis (I don't have an account) and whether the PR can be integrated.

Maybe what I can do is move the HBase example into the test package (right now I left it in the main folder) so it will force Travis to rebuild. I'll do it within a couple of hours.

Another thing I forgot to say is that the hbase extension is now compatible with both hadoop 1 and 2.

Best,
Flavio
