[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-15 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
Thanks to @fhueske, @tonycox, and @wuchong for helping to get this in.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-14 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3149
  
Thanks Robert! 
I'll let the test run again and will merge if everything passes. Thanks 
@tonycox and @ramkrish86 for digging into this.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-14 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/3149
  
I think it's fine to change the JVM settings for the HBase module. Other
modules such as Elasticsearch are also doing that.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-14 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3149
  
It's not only Java 7. For Hadoop 2.4.1 and Java 7, the builds pass. So it 
seems to be a combination of both parameters. Anyway, I think we can configure 
the JVM for the tests. 

It would be good to check if @rmetzger has any objections. 
He knows Maven and testing infrastructure much better than I do.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-13 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
@fhueske - Are you fine with that pom change? If so we can get this in.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-13 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
I think that is not the only reason. Somewhere, these tests are either creating
more static objects or larger objects that live for the lifetime of the JVM;
maybe this test just exposes it. Actually, this change to MaxPermSize calls for
a discussion on what the Maven settings should be and what the minimum required
heap size is. Also, other projects have already moved the default JDK to 8, at
least for the trunk version. I think something similar can be done here.
Thanks @tonycox.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-13 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
@ramkrish86 Java 8 has a different memory model; PermGen was replaced by
Metaspace: https://dzone.com/articles/java-8-permgen-metaspace
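
For reference, a minimal sketch (not part of the PR) that shows the difference:
listing the JVM's memory pools prints a PermGen pool on Java 7, bounded by
`-XX:MaxPermSize`, while Java 8 prints a Metaspace pool instead.
```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class MemoryPools {
    public static void main(String[] args) {
        // On a Java 7 HotSpot JVM this lists a "Perm Gen" pool (bounded by
        // -XX:MaxPermSize); on Java 8 it lists "Metaspace" instead.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getName() + " -> " + pool.getUsage());
        }
    }
}
```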




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-13 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
@fhueske - so are you OK with @tonycox's suggestion of setting MaxPermSize
for the hbase module?




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-13 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
Even jhat was not able to view the file, as it had a problem parsing the
hprof file. So my question is: if MaxPermSize is 128M, why does it work with
JDK 8?




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-13 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
@ramkrish86 You can use jhat to read the hprof dump file.
If we set `-XX:MaxPermSize=128m` in the Surefire configuration, e.g.
```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <version>2.19.1</version>
    <configuration>
        <argLine>-XX:MaxPermSize=128m</argLine>
        <!-- element name below is an assumption; the archived mail stripped the XML tags -->
        <forkCount>1</forkCount>
    </configuration>
</plugin>
```
all tests will pass.





[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-13 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
I tried multiple ways to take a heap dump from this mvn test run. I get an
hprof file which, on opening in a heap dump analyzer, throws an EOFException or
an NPE.
@tonycox - were you able to get any heap dump?




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-12 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
I tried to compare TableInputFormat and the new one. But the interesting part
is that the HBase InputFormat does not fail on JDK 8, which was the default in
my test environment.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-12 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3149
  
Hi @ramkrish86, @tonycox could you reproduce the OOME on a local setup? 
We might want to compare the new HBase InputFormat and the existing 
`TableInputFormat` which does not fail in tests.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-10 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
@ramkrish86 Try to add 
```
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath="/tmp"
``` 
as JVM parameters.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-10 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
Thanks to @tonycox for helping me reproduce this error. Switching to JDK 7
creates this issue, and it occurs because the PermGen space runs out of memory.
I don't have a solution for this; it runs on JDK 8 with no hassle. Any input
here?




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-09 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
> Results :

Tests run: 2, Failures: 0, Errors: 0, Skipped: 0

This is what I get as test result.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-09 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
It is CentOS Linux release 7.0.1406 (Core). 




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-09 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
@ramkrish86 what is your environment?




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-09 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
I tried even that. The test runs fine for me; I don't get any OOME.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-09 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
Try to run this in the Flink folder (I ran it on Ubuntu 16.04):
```bash
export JAVA_HOME=/path/to/java-7-oracle/jre
mvn -B -Dhadoop.version=2.3.0 \
test -Dtest=org.apache.flink.addons.hbase.example.HBaseTableSourceITCase \
-DfailIfNoTests=false
```




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-07 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
`02/02/2017 06:19:20DataSink (collect())(6/32) switched to 
SCHEDULED 
02/02/2017 06:19:20 DataSink (collect())(2/32) switched to SCHEDULED 
02/02/2017 06:19:20 DataSink (collect())(8/32) switched to SCHEDULED 
02/02/2017 06:19:20 DataSink (collect())(6/32) switched to DEPLOYING 
02/02/2017 06:19:20 DataSink (collect())(8/32) switched to DEPLOYING 
02/02/2017 06:19:20 DataSink (collect())(2/32) switched to DEPLOYING 
02/02/2017 06:19:25 DataSink (collect())(1/32) switched to SCHEDULED 
02/02/2017 06:19:25 DataSink (collect())(1/32) switched to DEPLOYING 
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "JvmPauseMonitor"
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "hconnection-0x2247c79b-shared--pool20-t1"
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "JvmPauseMonitor"
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "flink-akka.actor.default-dispatcher-3"
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "IPC Server handler 1 on 34267"
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "CHAIN DataSource (localhost:53456)"
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "CHAIN DataSource (localhost:53456)"
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "CHAIN DataSource (localhost:53456)"
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "org.apache.hadoop.hdfs.PeerCache@28360f4a"`

The same tests run cleanly on my Linux box. Is there any place where I can
download the logs to see what the issue could be?




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-07 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
> Will need to figure out what's the reason for that before I can merge the PR.

I tried running those tests again on my Linux box and everything went through
without any error.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-07 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3149
  
I observed an `OutOfMemoryError` in one of the Travis profiles when running 
the final tests. Will need to figure out what's the reason for that before I 
can merge the PR. 




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-07 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3149
  
Alright. The PR looks mostly good. I'll make a few refinements and will 
merge it then.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-06 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
@fhueske - A gentle reminder !!!




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-02 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3149
  
Thanks for the update @ramkrish86 and your patience!
I skimmed over the PR and it looks good.
I'll do a final pass early next week before merging it.

Thanks, Fabian




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-02 Thread wuchong
Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/3149
  
The `NestedFieldsProjectableTableSource` makes sense to me. +1 to implement 
it in a separate JIRA.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-01 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
@fhueske - Please have a look at the javadoc.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-01 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
> For now I'd suggest to keep the scope of the PR as it is right now. A bit
> more Java documentation on HBaseTableSource to explain how it is used would
> be great. We can implement the NestedFieldsProjectableTableSource and the
> changes to HBaseTableSource in a follow up issue.

+1 for this. I can add some more javadoc to it. BTW, I am trying to check out
these projections and the use of the ProjectableTableSource; I will get back on
it. Thanks to everyone for all the comments and feedback.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-01 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3149
  
Hi all,

thanks for the feedback. Let's stick to the nested schema then.

I think the best approach to support projections on nested fields is to
implement a second interface (i.e., a trait without a default implementation)
called `NestedFieldsProjectableTableSource`, as @tonycox suggested.
Adding a method with a default implementation to `ProjectableTableSource`
would not work, because it would turn the class into a Java abstract class
while it is an interface now.
Using flat indices is not a very nice solution either, IMO, because it is not
easy to parse.

For now I'd suggest to keep the scope of the PR as it is right now. A bit 
more Java documentation on `HBaseTableSource` to explain how it is used would 
be great.
We can implement the `NestedFieldsProjectableTableSource` and the changes 
to `HBaseTableSource` in a follow up issue.

What do you think?
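
For illustration, a rough Java-style sketch of such a second interface,
following the names used in this discussion rather than any final Flink API:
```java
// Sketch only: a separate interface for nested projection push-down, so the
// existing ProjectableTableSource (flat int[] indices) stays unchanged.
public interface NestedFieldsProjectableTableSource<T> {

    // Each entry is a dot-separated path into the nested row type, e.g.
    // {"f1.q1", "f2.q2"}; the source returns a copy reading only those fields.
    NestedFieldsProjectableTableSource<T> projectNestedFields(String[] nestedFields);
}
```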




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-01 Thread wuchong
Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/3149
  
Sorry for the late response. 

Regarding the `HBaseTableSchema`, I agree with moving the `addColumn(...)`
method into `HBaseTableSource`.

Regarding the nested vs. flat schema, I prefer the nested schema; it is more
intuitive to use.
As for the nested schema not supporting projection push-down, I think we
should extend `ProjectableTableSource` to support pushing projections down into
a composite type. We can keep the interface unchanged, i.e. `def
projectFields(fields: Array[Int]): ProjectableTableSource[T]`, but the indices
in `fields` should be flat indices. We can use the flat field indices to do
projection push-down even if the schema is nested.

For example, for a table source with schema `a: Int, b: Row(b1, b2), c: Boolean`,
the flat indices of `a, b.b1, b.b2, c` are `0, 1, 2, 3`, so a projection
`SELECT b.b1, c FROM T` will result in a `fields` array of `Array(1, 3)`.

What do you think?


For me, the biggest drawback of a nested schema is the lack of support for
pushing projections down.



[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-01 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
> ProjectableTableSource works in scan process.

Yeah, got it. I was just trying to relate it to this HBase work and found that
we read all columns, apply a flatMap, and then return only the required
columns. I just read that PR to understand it better.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-01 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
Going through the PR for https://issues.apache.org/jira/browse/FLINK-3848, I
think we try to project only the required columns there. We could do the same
here, so my and @tonycox's suggestion of having a new flavor of
ProjectableTableSource could help.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-01 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
@ramkrish86 ProjectableTableSource works in the scan process. Without it, the
TableScan scans all columns and applies a flatMap function on the whole data
set to project and filter. That is inefficient compared to pushing the
projection and filtering right into the scan process.
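
For comparison, pushing the projection into the HBase scan itself looks roughly
like this (a sketch with example family and qualifier names, not code from this
PR):
```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ProjectedScan {
    static Scan buildScan() {
        // Only the requested cells are read from the region servers, instead
        // of fetching whole rows and dropping columns in a later flatMap.
        Scan scan = new Scan();
        scan.addColumn(Bytes.toBytes("f1"), Bytes.toBytes("q1"));
        scan.addColumn(Bytes.toBytes("f2"), Bytes.toBytes("q2"));
        return scan;
    }
}
```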




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-01 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
```java
Table result = tableEnv
    .sql("SELECT test1.f1.q1, test1.f2.q2 FROM test1 where test1.f1.q1 < 103");
```

I just tried this query and it works with or without the
ProjectableTableSource, so I just wanted to know when the projection comes into
play. I thought queries of this sort might not work without projection.






[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-02-01 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
@fhueske I prefer the better API with nesting. We can extend
`ProjectableTableSource` with a method like this
```scala
def projectNestedFields(fields: Array[String]): ProjectableTableSource[T]
```
with a default implementation, override the method in the HBase table source,
and, while pushing the projection down, extract the `fieldAccessor`s from
`inputRex` and push them down as well.
Or create another `ProjectableNestedFieldsTableSource` with the same method.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-31 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
> Regarding the HBaseTableSchema, we could also use it only internally and not
> expose it to the user. The HBaseTableSource would have a method addColumn()
> and forward the calls to its internal HBaseSchema.

Have done this. I initially thought to do things at construction time itself.
Now I have added an addColumn() to HBaseTableSource, and HBaseSchema becomes
completely package-private with no access for users.

Regarding the flat schema: generally in HBase only the family is required and
the qualifiers are dynamic. But here, for the sake of accessibility, we expect
the user to specify the column names (trying to give it a relational look). It
is in the case of projections where we have some issues. In my opinion, if we
had a better API for projection, maybe we could handle it better? The current
nested way, as you said, is better in the sense that all columns of a family
are grouped together.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-31 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
Just a general question about
`def projectFields(fields: Array[Int]): ProjectableTableSource[T]`

Is it mandatory to only have an int[] here? Could we have a String[] that
allows specifying the field names? Maybe even more generic would be a
ColumnarTableProjectableSource that allows specifying the family-to-column
mapping in some way. Ultimately it's the table source that is going to do the
mapping and create the projected table source, so that should be fine?




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-31 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3149
  
The points I raised affect the API, and different people tend to have
different opinions about APIs ;-).

I suggested using a flat schema that names columns `columnFamily$qualifier`,
i.e., no nesting, but composing the column name from `colFamily` and
`qualifier` and separating them by `$`. Internally we can and should still use
the `family`-`qualifier` pair, but just map everything to a flat schema. The
question is whether this would make the `HBaseTableSource` harder to use. I
don't think the column access (`family.qualifier` vs. `family$qualifier`)
would be much harder, but working with families that have lots of columns
would be more cumbersome, because each column would be a top-level column and
would need to be explicitly selected in a `SELECT` clause. In the nested case,
all columns of a family are conveniently grouped together.

Regarding the `HBaseTableSchema`, we could also use it only internally and not
expose it to the user. The `HBaseTableSource` would have a method
`addColumn()` and forward the calls to its internal `HBaseSchema`.
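
A hedged usage sketch of that design, assuming `conf` is an HBase
`Configuration` and `tableEnv` is a `BatchTableEnvironment`; the constructor
and method names follow this discussion, not necessarily the merged API:
```java
// Usage sketch only: columns are registered on the HBaseTableSource, which
// forwards them to its internal, package-private HBaseTableSchema.
HBaseTableSource hbaseSource = new HBaseTableSource(conf, "test1");
hbaseSource.addColumn("f1", "q1", Integer.class);
hbaseSource.addColumn("f1", "q2", String.class);
hbaseSource.addColumn("f2", "q2", Long.class);
tableEnv.registerTableSource("test1", hbaseSource);
```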




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-31 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
Updated the PR to fix the comments. The comments were simple, but adding
AbstractTableInputFormat and moving the code back and forth makes this a bigger
change; internally, though, these are just refactorings. The test cases pass.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-31 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
To understand better:

> We could make flat schema an optional mode or implement it as a separate
> TableSource as well.

and this one

> This could be solved if we use a flat schema and encode the nesting as
> columnFamily$column

Are you talking about using separators for it? Maybe I am not getting your
concern here. Yes, I agree that the nested schema is the better API, but if we
go with a flat schema then maintaining the family-to-qualifier relation may not
be easy. As you said, a separate TableSource where we define such things would
be better.
Regarding HBaseTableSchema, I think it is better to keep that class so that we
can extend it for better serialization and deserialization by adding more logic
for different types. Even if we go with a flat schema, I think this kind of
class would help us maintain the family-to-qualifier mapping.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-30 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
@tonycox
I have addressed all your latest comments, including making HBaseTableSource a
ProjectableTableSource.
@wuchong, @fhueske
Are you fine with the latest updates? If so, we can try closing this PR, and
further discussions on adding a StreamTableSource and supporting WHERE clauses
can be done in subsequent PRs.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-30 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
@ramkrish86 sorry for the confusion, don't pay attention to the newly generated
maps.
I think `PushProjectIntoBatchTableSourceScanRule` is not good enough for
nested data types, but we can at least project family columns now.
`HBaseTableSchema` keeps the `familyMap` and might also keep a list of family
names; for example, while `addColumn` runs:
```java
List<TypeInformation<?>> list = this.familyMap.get(family);
if (list == null) {
    familyNames.add(family);
    list = new ArrayList<>();
}
```
so we have access to a family's columns by index.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-29 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
I am not sure how we will manage this. But one thing that can be done: if an
invalid family name is added to the scan, HBase internally throws
FamilyNotFoundException, so we can track that and report back. But I am not
sure how to track the schema. So every time the user wants to scan two
families, they have to call HBaseTableSchema#addColumns() twice (with the
required qualifiers).
@tonycox
Can you help me understand what you mean by 'we can add more maps that
generate after familyMap has set'?




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-28 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
`HBaseTableSchema` holds the initial information about the table columns, so
we can add more maps that are generated after the `familyMap` has been set:
```java
Map<String, List<TypeInformation<?>>> familyMap;
Map<String, Integer> familyMapId;
Map<String, Integer> columnMapId;
```




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-27 Thread wuchong
Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/3149
  
I think the OutOfMemoryError has nothing to do with this PR.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-27 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
For ProjectableTableSource I think I need some clarity, because currently it
is based on an int[] representing the fields, so I am not sure how to map those
indices to qualifiers under a family.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-26 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
@ramkrish86 I think you need to implement the projectable interface in the HBase source.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-26 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
@fhueske ,
I have addressed the comments from @wuchong, and he gave a +1 after I fixed
them. Please have a look at the PR and see if it is fine with you too.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-26 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
> Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread 
"RpcServer.reader=6,bindAddress=testing-docker-bb4f2e37-e79f-42a3-a9e9-4995e42c70ba,port=45919"
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "CHAIN DataSource (at 
getDataSet(HBaseTableSource.java:63) 
(org.apache.flink.addons.hbase.HBaseTableSourceInputFormat)) -> FlatMap 
(select: (f1.q1 AS q1, f1.q2 AS q2, f1.q3 AS q3)) (6/32)"
01/27/2017 05:57:12 DataSink (collect())(13/32) switched to DEPLOYING 
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "LeaseRenewer:travis@localhost:39289"
01/27/2017 05:57:14 DataSink (collect())(12/32) switched to SCHEDULED 
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "CHAIN DataSource (at 
getDataSet(HBaseTableSource.java:63) 
(org.apache.flink.addons.hbase.HBaseTableSourceInputFormat)) -> FlatMap 
(select: (f1.q1 AS q1, f1.q2 AS q2, f1.q3 AS q3)) (10/32)"

I am getting this error in the Travis build, but on my Linux box it seems to
pass.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-26 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
@wuchong 
I think I have updated the last comments from you. Thank you for all your 
help/support here.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-25 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
Fixed all the minor comments given above. @tonycox , @wuchong , @fhueske .




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-25 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
@fhueske , @tonycox , @wuchong - FYI.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-25 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
Updated the code per the comments and pushed again. I think I have addressed
all the comments here; feedback/comments welcome. I also found that it is
better to use the TableInputSplit to specify the start and end row, so that the
scan is restricted to the given range anyway.
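
Roughly what that restriction looks like (a sketch; the `TableInputSplit`
accessor names are assumptions, not taken from this PR):
```java
// Sketch only: restrict the scan to the key range carried by the input split.
Scan scan = new Scan();
scan.setStartRow(split.getStartRow());  // assumed TableInputSplit accessor
scan.setStopRow(split.getEndRow());     // assumed TableInputSplit accessor
```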




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-24 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
@fhueske, @tonycox, @wuchong
I have updated the PR based on all the feedback here. You can now see that we
support a composite RowType and are able to specify multiple column families
along with the qualifier names.
We are able to retrieve the result by doing a full scan. This is not
efficient, and we need to be able to specify start and end rows; I think that
can be done after FilterableTableSource is done.
I have added test cases that show a single column family and two column
families.
For now, if the TypeInformation is not known, we use the plain byte[] type
only; that happens at the validation stage itself. But one main concern from my
side is how to represent NULL, i.e., when we specify a column with a type but
there is no data for that column. For now I have handled it by returning the
Int, Float, and Long MIN_VALUEs, but I don't believe that is right. Feedback
and suggestions are welcome.
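
A minimal illustration of the fallback described above (a sketch, not the PR's
actual code; the helper name is made up):
```java
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

class NullHandlingSketch {
    // A missing cell is mapped to Long.MIN_VALUE instead of a proper SQL NULL.
    static long readLongOrSentinel(Result row, byte[] family, byte[] qualifier) {
        byte[] value = row.getValue(family, qualifier); // null when the cell is absent
        return value == null ? Long.MIN_VALUE : Bytes.toLong(value);
    }
}
```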




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-23 Thread wuchong
Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/3149
  
Sounds good ! Looking forward that !




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-23 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
The good news is that, with the help of this composite RowType, modifying my
code accordingly, and some debugging, I could get the basic thing to work. Now
I will work on stitching things together and submitting a PR with the updated
changes.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-23 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
Thanks for all the inputs here. I have been trying to make my existing code 
work with the composite RowTypeInfo. Once that is done I will try to introduce 
the HBaseTableSchema.
Also I would like to work on FLINK-3849 (FilterableTableSource) after this 
first version of HBaseTableSource is accepted.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-23 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3149
  
You are right @wuchong, we should break it down into two issues. 

I also agree about the serialization. We should offer defaults for 
primitives (`byte`, `short`, `int`, `long`, `boolean`, `float`, `double`) and a 
set of common character encodings (UTF-8, ASCII, etc.) for `String`. Everything 
else can be initially handled as `byte[]`, IMO.
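
A rough sketch of such a default mapping, assuming values were written with
HBase's `Bytes` utility (only a few of the listed types are shown; everything
else falls back to raw bytes):
```java
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.hbase.util.Bytes;

class DefaultDeserializerSketch {
    // Map a raw HBase cell to the declared column type; unknown types stay raw.
    static Object deserialize(byte[] raw, Class<?> type) {
        if (type == Integer.class) return Bytes.toInt(raw);
        if (type == Long.class)    return Bytes.toLong(raw);
        if (type == Boolean.class) return Bytes.toBoolean(raw);
        if (type == Double.class)  return Bytes.toDouble(raw);
        if (type == String.class)  return new String(raw, StandardCharsets.UTF_8);
        return raw;                // everything else is handled as byte[]
    }
}
```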




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-21 Thread wuchong
Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/3149
  
Hi @fhueske,

Regarding the field type serialization, I think maybe we can provide default
deserialization for basic types (int, long, String, ...) if users used
`Bytes.toBytes(...)` to serialize them. If not, users can ask for the field to
be returned as raw bytes, e.g. `htableSchema.add("column_family", "qualifier",
byte[].class)`, and then use a user-defined scalar function to deserialize the
value.

Regarding the row keys, I agree with you. It would be great if we could set
the scan range from the WHERE clause. But FLINK-3849 (FilterableTableSource) is
still a pending PR, so I would suggest breaking this issue into two: 1. add
HBaseTableSource, provide access to HBase tables, and support the nested
schema; 2. extend HBaseTableSource to support FilterableTableSource.
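
A hedged sketch of the scalar-function route for custom serialization (the
base class is Flink's Table API `ScalarFunction`; the exact package may differ
between versions):
```java
import org.apache.flink.table.functions.ScalarFunction;
import org.apache.hadoop.hbase.util.Bytes;

// A field declared as byte[].class can be decoded in the query with a
// user-defined scalar function instead of a built-in conversion.
public class ParseLong extends ScalarFunction {
    public long eval(byte[] value) {
        return Bytes.toLong(value);
    }
}
```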




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-20 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3149
  
Hi @ramkrish86, @tonycox, and @wuchong,

sorry for joining the discussion a bit late. I haven't looked at the code 
yet, but I think the discussion is going into the right direction. 

I had a look at [how Apache Drill provides access to HBase 
tables](https://drill.apache.org/docs/querying-hbase/). Drill also uses a 
nested schema of `[rowkey, colfamily1[col1, col2, ...], colfamiliy2[col1, col2, 
...] ...]` so basically the same as we are discussing here.

Regarding the field types: the serialization is not under our control, so we
should also offer to just return the raw bytes (as Drill does). If users have
custom data types or serialization logic, they can use a user-defined scalar
function to extract the value. I don't know what the standard serialization
format for primitives is in HBase (or whether there is one at all).

Regarding restricting the scan with rowkeys. @tonycox's PR for [filterable 
TableSources](https://github.com/apache/flink/pull/3166) can be used to set the 
scan range. This would be much better than "hardcoding" the scan ranges in the 
TableSource.

Best, Fabian




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
Thanks for the ping here @tonycox . 




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
As Jark Wu said in [jira](https://issues.apache.org/jira/browse/FLINK-5554) 

> I think the HBaseTableSource should return a composite type (with column 
family and qualifier), and we can get columns by composite type accessing.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
We need to discuss that.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
So I am not sure whether we can add an HBase-specific API in `table.api`. Is
that allowed?




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
And could you also extend it with `StreamTableSource`?




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
I think this can be solved in a different issue that provides a new API in
`table.api` for selecting from HBase.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
Thanks @tonycox. Yes, I am fine with it; I can try. Anyway, I think there is
more to do on my side. Right now I am not sure how to register the table with
the valid family name and column names; it only registers the table and does
not resolve '.select("f1:q1, f1:q2, f1:q3");'.
I am only now able to run the test case, which is why I am finding these. I
will wait for comments and then go on with updating the PR.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
@ramkrish86 @fhueske what do you think about removing `Tuple` (`T extends
Tuple`) from `org.apache.flink.addons.hbase.TableInputFormat` and implementing
that abstract class in your `HBaseTableSourceInputFormat`?




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
I see. I am not sure how I missed that, because I thought my IDE was already
updated with the latest code. Will check it.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3149
  
@tonycox is right. Please rebase your PR to the current master branch. The 
`TableSource` interface was recently modified.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/3149
  
You need to recompile the `TableSource` trait manually and implement
`DefinedFieldNames` in `HBaseTableSource`.




[GitHub] flink issue #3149: FLINK-2168 Add HBaseTableSource

2017-01-18 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/3149
  
And one more thing: other than BasicTypeInfo, what other types should we
support? I was not sure about that, so I added a TODO there.

