Re: Updating samza-sql branch to Java 1.7

2015-04-15 Thread Milinda Pathirage
Thanks everyone. Milinda On Tue, Apr 14, 2015 at 6:06 PM, Yi Pan nickpa...@gmail.com wrote: Merged master to samza-sql. On Tue, Apr 14, 2015 at 2:57 PM, Jakob Homan jgho...@gmail.com wrote: Yes, I removed the tests for JDK6 yesterday. We're 1.7 or above now for development. On 14

Re: How to configure the Resource Manager endpoint for YARN?

2015-04-15 Thread Roger Hoover
I'll try that. Thanks, Chris. On Wed, Apr 15, 2015 at 9:37 AM, Chris Riccomini criccom...@apache.org wrote: Hey Roger, Not sure if this makes a difference, but have you tried using: export YARN_CONF_DIR=... instead? This is what we use. Cheers, Chris On Wed, Apr 15, 2015 at 9:33

How to configure the Resource Manager endpoint for YARN?

2015-04-15 Thread Roger Hoover
Hi, I'm trying to deploy a job to a small YARN cluster. How do I tell the launcher script where to find the Resource Manager? I tried creating a yarn-site.xml and setting the HADOOP_CONF_DIR environment variable, but it doesn't find my config. 2015-04-14 22:02:45 ClientHelper [INFO] trying to connect

Re: Extra Systems and other extensions.

2015-04-15 Thread Chinmay Soman
+1! I was going to do this for my use case as well. Would love to have this! On Wed, Apr 15, 2015 at 9:24 AM, Roger Hoover roger.hoo...@gmail.com wrote: Dan, This is great. Would love to have a common ElasticSearch system producer. Cheers, Roger On Tue, Apr 14, 2015 at 1:34 PM, Dan

Maximum number of jobs

2015-04-15 Thread jeremy p
What's the maximum number of Samza jobs I can run simultaneously on a single cluster? Let's say these jobs are very lightweight -- they require little memory or processing power. However, I need a lot of them -- let's say I need to have 1,000,000 running at any given time. Is this reasonable or

Re: How to configure the Resource Manager endpoint for YARN?

2015-04-15 Thread Chris Riccomini
Hey Roger, Hmm, that's good to know, lol. Wonder how ours is working. :) I'll poke around. Cheers, Chris On Wed, Apr 15, 2015 at 11:17 AM, Roger Hoover roger.hoo...@gmail.com wrote: Turns out that HADOOP_CONF_DIR is the right env var (YARN_CONF_DIR did not work). I had just messed up the
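A minimal sketch of the setup this thread converges on, under stated assumptions: the directory is a throwaway one created with mktemp, and the Resource Manager host rm.example.com is a placeholder; neither appears in the thread. The property yarn.resourcemanager.hostname is the standard YARN setting for the Resource Manager host. Per Roger's report, HADOOP_CONF_DIR (not YARN_CONF_DIR) was the variable that worked for him.

```shell
# Hypothetical config directory; substitute the directory that holds your yarn-site.xml.
conf_dir="$(mktemp -d)"

# Write a minimal yarn-site.xml that points clients at the Resource Manager.
# rm.example.com is a placeholder host, not a value from the thread.
cat > "$conf_dir/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm.example.com</value>
  </property>
</configuration>
EOF

# Export the variable so processes started from this shell inherit it.
export HADOOP_CONF_DIR="$conf_dir"
echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
```

The variable has to be exported in the same shell session that starts the launcher script; otherwise the launcher does not inherit it.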

Re: Review Request 33219: [SAMZA-649] Create samza-sql-calcite module for Calcite SQL front end

2015-04-15 Thread Yi Pan (Data Infrastructure)
On April 15, 2015, 6:20 p.m., Yi Pan (Data Infrastructure) wrote: samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/schema/AvroSchemaConverter.java, line 37 https://reviews.apache.org/r/33219/diff/1/?file=930371#file930371line37 I assume that this class is used to

Re: Review Request 33219: [SAMZA-649] Create samza-sql-calcite module for Calcite SQL front end

2015-04-15 Thread Milinda Pathirage
On April 15, 2015, 6:20 p.m., Yi Pan (Data Infrastructure) wrote: samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/planner/QueryPlanner.java, line 61 https://reviews.apache.org/r/33219/diff/1/?file=930367#file930367line61 One quick question: do we need to implement

Re: Review Request 33219: [SAMZA-649] Create samza-sql-calcite module for Calcite SQL front end

2015-04-15 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33219/#review80237 --- Ship it! +1 - Yi Pan (Data Infrastructure) On April 15, 2015,

Re: How to deal with bootstrapping

2015-04-15 Thread Yan Fang
Hi Jeremy, If my understanding is correct, whenever you add a new rule, you want to apply this rule to the historical data. Right? If you do not care about duplication, you can create a new task that contains existing rules and new rules. Configure bootstrap. This will apply all the rules from

Re: How to deal with bootstrapping

2015-04-15 Thread jeremy p
Hello Yan, Thank you for the suggestion! I think your solution would work; however, I am afraid it would create a performance problem for our users. Let's say we kill the Classifier task, and create a new Classifier task with both the existing rules and new rules. We get the offset of the

Re: Review Request 33219: [SAMZA-649] Create samza-sql-calcite module for Calcite SQL front end

2015-04-15 Thread Milinda Pathirage
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33219/ --- (Updated April 15, 2015, 6:56 p.m.) Review request for samza, Chris Riccomini

Re: How to deal with bootstrapping

2015-04-15 Thread Yan Fang
Hi Jeremy, In order to reach this goal, we have to assume that the job with new rules can always catch up with the one with old rules. Otherwise, I think we have no choice but to run a lot of jobs simultaneously. Under our assumption, we have job1 with old rules running, and now add

Re: Samza Unit Test Instructions

2015-04-15 Thread Yan Fang
Hi Yuanchi, Samza does not provide out-of-the-box unit tests. But there are some ways to test: 1) If you only want to test the logic in the Task class, normal unit tests will work. You can create a unit test that tests init(), process(), etc. 2) Create mock systems by implementing SystemAdmin,

Samza Unit Test Instructions

2015-04-15 Thread Yuanchi Ning
Hello Samza Team, This is Yuanchi Ning from Uber Data Engineering, Realtime Metrics, Streaming Platform team. We are planning to use Samza to process the realtime data we have, and thanks for developing such an awesome open source project. While I am building our streaming service using Samza,