Re: Reading from HBase problem

2015-06-09 Thread Fabian Hueske
Would you mind opening a JIRA for this issue? - https://issues.apache.org/jira/browse/FLINK I can do it as well, but you know all the details. Thanks, Fabian 2015-06-09 11:03 GMT+02:00 Hilmi Yildirim hilmi.yildi...@neofonie.de: I want to add that I run the Flink job on a cluster with 13

Re: Apache Flink transactions

2015-06-09 Thread Aljoscha Krettek
Hi, we don't have any current performance numbers. But the queries mentioned on the benchmark page should be easy to implement in Flink. It could be interesting if someone ported these queries and ran them with exactly the same data on the same machines. Bill Sparks wrote on the mailing list some

Re: Reading from HBase problem

2015-06-09 Thread Hilmi Yildirim
Done https://issues.apache.org/jira/browse/FLINK-2188 On 09.06.2015 at 11:26, Fabian Hueske wrote: Would you mind opening a JIRA for this issue? - https://issues.apache.org/jira/browse/FLINK I can do it as well, but you know all the details. Thanks, Fabian 2015-06-09 11:03 GMT+02:00 Hilmi

Re: Reading from HBase problem

2015-06-09 Thread fhueske
Thank you very much! From: Hilmi Yildirim Sent: Tuesday, 9 June 2015 11:40 To: user@flink.apache.org Done https://issues.apache.org/jira/browse/FLINK-2188 On 09.06.2015 at 11:26, Fabian Hueske wrote: Would you mind opening a JIRA for this issue? -

Re: Reading from HBase problem

2015-06-09 Thread Fabian Hueske
OK, so the problem seems to be with the HBase InputFormat. I guess this issue needs a bit of debugging. We need to check if records are emitted twice (or more often) and, if that is the case, which records. Unfortunately, this issue only seems to occur with large tables :-( Did I get that right,
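
The check described above (are records emitted more than once, and if so, which ones) could be sketched as follows. This is only an illustration in plain Java, not the Flink or HBase API: the class name `DuplicateRowKeys`, the method `findDuplicates`, and the sample row keys are all hypothetical, and it assumes the row keys emitted by the input format can be collected into a list, which will not hold for a table of this size without sampling or a distributed group-and-count.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateRowKeys {

    // Returns every row key that appears more than once, mapped to its count.
    // Hypothetical helper; not part of Flink or HBase.
    static Map<String, Integer> findDuplicates(List<String> emittedRowKeys) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String key : emittedRowKeys) {
            counts.merge(key, 1, Integer::sum);
        }
        // Drop keys that were emitted exactly once.
        counts.values().removeIf(count -> count < 2);
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample; in the real job these would be the keys read from HBase.
        List<String> emitted =
                List.of("row-1", "row-2", "row-1", "row-3", "row-2", "row-1");
        System.out.println(findDuplicates(emitted)); // prints {row-1=3, row-2=2}
    }
}
```

In a Flink job the same idea would presumably be a group-by on the row key followed by a count and a filter for counts greater than one.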

Re: Reading from HBase problem

2015-06-09 Thread Hilmi Yildirim
Hi, I have now tested the count method. It returns the same result as the flatmap.groupBy(0).sum(1) method. Furthermore, HBase contains nearly 100 million rows, but the result is 102 million. This means that the HBase input reads more rows than HBase contains. Best Regards, Hilmi On 08.06.2015

Re: Reading from HBase problem

2015-06-09 Thread Hilmi Yildirim
Correct. I also counted the rows with Spark and Hive. Both returned the same value, which is nearly 100 million rows. But Flink returns 102 million rows. Best Regards, Hilmi On 09.06.2015 at 10:47, Fabian Hueske wrote: OK, so the problem seems to be with the HBase InputFormat. I guess this issue

Re: when return value from linkedlist or map and use in filter function display error

2015-06-09 Thread hagersaleh
The error displayed is: non-static variable map cannot be referenced from a static context map.put(C_MKTSEGMENT, 2); Code: public Map<String, Integer> map = new HashMap<String, Integer>(); public static void main(String[] args) throws Exception { map.put(C_MKTSEGMENT, 2);

Load balancing

2015-06-09 Thread Kruse, Sebastian
Hi folks, I would like to do some load balancing within one of my Flink jobs to achieve good scalability. The rebalance() method is not applicable in my case, as the runtime is dominated by the processing of very few larger elements in my dataset. Hence, I need to distribute the processing

Re: when return value from linkedlist or map and use in filter function display error

2015-06-09 Thread hagersaleh
I could solve the problem with final Map<String, Integer> map = new HashMap<String, Integer>(); Thank you very much. The code runs on the command line without any error. public static void main(String[] args) throws Exception { final Map<String, Integer> map = new HashMap<String, Integer>();
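
For readers hitting the same compiler error: an instance field cannot be accessed from a static method such as main, whereas a local variable declared final (or effectively final since Java 8) can be captured by an anonymous inner class. The sketch below illustrates this with `java.util.function.Predicate` as a stand-in for a filter function. The class name `FinalMapExample`, the method `knownColumn`, and the use of C_MKTSEGMENT as a string literal are assumptions made for illustration; the original code is only partly visible in this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

public class FinalMapExample {

    // Builds a predicate that accepts only names present in the map.
    // Hypothetical stand-in for the filter function mentioned in the thread.
    static Predicate<String> knownColumn() {
        // Local and final, so the anonymous class below may capture it.
        final Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("C_MKTSEGMENT", 2); // assumed to be a string key
        return new Predicate<String>() {
            @Override
            public boolean test(String column) {
                return map.containsKey(column);
            }
        };
    }

    public static void main(String[] args) throws Exception {
        Predicate<String> filter = knownColumn();
        System.out.println(filter.test("C_MKTSEGMENT")); // prints true
        System.out.println(filter.test("C_NAME"));       // prints false
    }
}
```

Declaring the field static would also satisfy the compiler, but a static field is not a reliable way to share data with functions that run on a cluster, so the local final variable is probably the safer choice here.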

Re: when return value from linkedlist or map and use in filter function display error

2015-06-09 Thread Robert Metzger
Great! I'm happy to hear that it worked. On Tue, Jun 9, 2015 at 5:28 PM, hagersaleh loveallah1...@yahoo.com wrote: I could solve the problem with final Map<String, Integer> map = new HashMap<String, Integer>(); Thank you very much. The code runs on the command line without any error. public static void

Re: Load balancing

2015-06-09 Thread Fabian Hueske
Hi Sebastian, I agree, shuffling only specific elements would be a very useful feature, but unfortunately it's not supported (yet). Would you like to open a JIRA for that? Cheers, Fabian 2015-06-09 17:22 GMT+02:00 Kruse, Sebastian sebastian.kr...@hpi.de: Hi folks, I would like to do some

Re: Apache Flink transactions

2015-06-09 Thread Hawin Jiang
On Tue, Jun 9, 2015 at 2:29 AM, Aljoscha Krettek aljos...@apache.org wrote: Hi, we don't have any current performance numbers. But the queries mentioned on the benchmark page should be easy to implement in Flink. It could be interesting if someone ported these queries and ran them with

Re: Apache Flink transactions

2015-06-09 Thread Hawin Jiang
Hey Aljoscha, I also sent an email to Bill asking for the latest test results. From Bill's email, Apache Spark's performance looks better than Flink's. What are your thoughts? Best regards, Hawin On Tue, Jun 9, 2015 at 2:29 AM, Aljoscha Krettek aljos...@apache.org wrote: Hi, we don't

Re: Reading from HBase problem

2015-06-09 Thread Hilmi Yildirim
Hi Ufuk, I used the TableInput format from flink-addons. Best Regards, Hilmi On 09.06.2015 at 13:17, Ufuk Celebi wrote: Hey Hilmi, thanks for reporting the issue. Sorry for the inconvenience this has caused. I'm not familiar with HBase in combination with Flink. From what I've seen, there