Task scheduling
Hi,

How can I configure task scheduling so that I can specify:

- Some tasks can run only on some nodes (not all).
- Some tasks shouldn't run together, but may if there is only one server. (I have some tasks that are CPU-intensive, so I don't want to run them together, but if there's no other way, let them run.)
- Priority among tasks.

The YARN fair scheduler gives me some of this functionality (but not all), and I also don't know how to specify the scheduling queue in Samza (if there is a way). Can you please help me?

Thank you

-- Ing. Alvaro Gareppe agare...@gmail.com
SAMZA build failing!!!
Hi,

I wasn't able to build Samza to execute the Samza jobs. I receive the exception below while executing samza-core_2.10. I checked out the master branch from https://github.com/apache/samza.git and am trying to build.

* What went wrong:
Execution failed for task ':samza-core_2.10:test'.

:samza-core_2.10:processTestResources
:samza-core_2.10:testClasses
:samza-core_2.10:checkstyleTest
:samza-core_2.10:test

testCanReadPropertiesConfigFiles FAILED
    java.lang.IllegalArgumentException: Illegal character in authority at index 7: file://samza1\samza-core/src/test/resources/test.properties
        at java.net.URI.create(URI.java:859)
        at org.apache.samza.config.factories.TestPropertiesConfigFactory.testCanReadPropertiesConfigFiles(TestPropertiesConfigFactory.scala:34)
    Caused by: java.net.URISyntaxException: Illegal character in authority at index 7: file://samza1\samza-core/src/test/resources/test.properties
        at java.net.URI$Parser.fail(URI.java:2829)
        at java.net.URI$Parser.parseAuthority(URI.java:3167)
        at java.net.URI$Parser.parseHierarchical(URI.java:3078)
        at java.net.URI$Parser.parse(URI.java:3034)
        at java.net.URI.init(URI.java:595)
        at java.net.URI.create(URI.java:857)
        ... 1 more

Can someone please help me fix this?

Thank you.

Regards, Raja Mahesh Aravapalli.
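The trace above suggests that the failing test builds a URI by prefixing a filesystem path with file://, which breaks as soon as the path contains backslashes (as Windows paths do). The following is a minimal Java sketch of that failure mode and one possible workaround. The class and method names (FileUriSketch, parsesAsUri, toFileUri) are hypothetical, and how the real TestPropertiesConfigFactory builds its URI is an assumption here, not something the thread confirms.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

class FileUriSketch {
    // Returns true if the raw string is accepted by URI.create, false if it is rejected.
    static boolean parsesAsUri(String raw) {
        try {
            URI.create(raw);
            return true;
        } catch (IllegalArgumentException e) {
            // URI.create wraps URISyntaxException in IllegalArgumentException.
            return false;
        }
    }

    // Lets the JDK build the file URI: separators are normalized and
    // characters such as spaces are percent-encoded.
    static URI toFileUri(Path path) {
        return path.toAbsolutePath().toUri();
    }

    public static void main(String[] args) {
        // The string from the stack trace: "samza1\samza-core" is parsed as the
        // URI authority, and a backslash is not a legal authority character.
        String concatenated = "file://samza1\\samza-core/src/test/resources/test.properties";
        System.out.println(parsesAsUri(concatenated));

        // Hypothetical directory name with spaces, similar to "samza1 - Copy".
        Path path = Paths.get("samza1 - Copy", "samza-core", "src", "test", "resources", "test.properties");
        System.out.println(toFileUri(path));
    }
}
```

If the test does concatenate strings, building the URI from a Path (or a java.io.File) instead may be enough to make it pass on Windows.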
Re: [Discuss/Vote] upgrade to Yarn 2.6.0
Hi Roger,

If you plan to upgrade to 2.6.0, and no other companies are using 2.4.0, I think we can upgrade to YARN 2.6.0 in 0.10.0.

Thanks,
Fang, Yan yanfang...@gmail.com

On Thu, Aug 20, 2015 at 4:48 PM, Yi Pan nickpa...@gmail.com wrote:

Hi, Selina,

Samza 0.9.1 on YARN 2.6 is a proven working solution.

Best, -Yi

On Thu, Aug 20, 2015 at 12:28 PM, Selina Tech swucaree...@gmail.com wrote:

Hi, Yi:

If I use Samza 0.9.1 and YARN 2.6.0, will the system fail?

Sincerely, Selina

On Wed, Aug 19, 2015 at 1:58 PM, Yi Pan nickpa...@gmail.com wrote:

Hi, Roger,

At LinkedIn we have already moved to YARN 2.6 and are moving to YARN 2.7 now. I am not aware of any major issues in upgrading. I will let our team member Jon Bringhurst chime in, since he did all of the upgrade work and may have more insights.

@Jon, could you comment on this?

Thanks! -Yi

On Wed, Aug 19, 2015 at 9:12 AM, Roger Hoover roger.hoo...@gmail.com wrote:

We're using 2.4.0 in production. Are there any major incompatibilities to watch out for when upgrading to 2.6.0?

Thanks, Roger

On Mon, Aug 17, 2015 at 4:41 PM, Yan Fang yanfang...@gmail.com wrote:

Hi guys,

We have been discussing upgrading to YARN 2.6.0 (SAMZA-536 https://issues.apache.org/jira/browse/SAMZA-536), because there are some bug fixes after 2.4.0 and we cannot enable the YARN RM recovery feature in YARN 2.4.0 (SAMZA-750 https://issues.apache.org/jira/browse/SAMZA-750).

So we just want to find out whether any production users are still using YARN 2.4.0 and do not plan to upgrade to 2.6.0+. If there are no further concerns, I think we can go ahead and upgrade to YARN 2.6.0 in the Samza 0.10.0 release.

Thanks,
Fang, Yan yanfang...@gmail.com
Re: SAMZA build failing!!!
Hi Raja,

Did you run only samza-core or the whole Samza project? I downloaded Samza from the master branch and ran ./gradlew clean build. There was no error. Could you give a little more information about how you got this error?

Thanks,
Fang, Yan yanfang...@gmail.com

On Mon, Aug 24, 2015 at 9:54 AM, Raja.Aravapalli raja.aravapa...@target.com wrote:

Hi,

I wasn't able to build Samza to execute the Samza jobs. I receive the exception below while executing samza-core_2.10. I checked out the master branch from https://github.com/apache/samza.git and am trying to build.

* What went wrong:
Execution failed for task ':samza-core_2.10:test'.

:samza-core_2.10:processTestResources
:samza-core_2.10:testClasses
:samza-core_2.10:checkstyleTest
:samza-core_2.10:test

testCanReadPropertiesConfigFiles FAILED
    java.lang.IllegalArgumentException: Illegal character in authority at index 7: file://samza1\samza-core/src/test/resources/test.properties
        at java.net.URI.create(URI.java:859)
        at org.apache.samza.config.factories.TestPropertiesConfigFactory.testCanReadPropertiesConfigFiles(TestPropertiesConfigFactory.scala:34)
    Caused by: java.net.URISyntaxException: Illegal character in authority at index 7: file://samza1\samza-core/src/test/resources/test.properties
        at java.net.URI$Parser.fail(URI.java:2829)
        at java.net.URI$Parser.parseAuthority(URI.java:3167)
        at java.net.URI$Parser.parseHierarchical(URI.java:3078)
        at java.net.URI$Parser.parse(URI.java:3034)
        at java.net.URI.init(URI.java:595)
        at java.net.URI.create(URI.java:857)
        ... 1 more

Can someone please help me fix this?

Thank you.

Regards, Raja Mahesh Aravapalli.
Re: Review Request 37604: SAMZA-760 Samza Container should catch Throwables instead of just catching Exceptions
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/37604/#review96145
---

Overall, LGTM. Could you also add a unit test to verify this? Thank you.

samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala (line 581)
https://reviews.apache.org/r/37604/#comment151430

The log message should say "throwable", not "exception".

- Yan Fang

On Aug. 19, 2015, 8:14 a.m., Aleksandar Bircakovic wrote:

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/37604/
---

(Updated Aug. 19, 2015, 8:14 a.m.)

Review request for samza.

Repository: samza

Description
---
Added a catch for Throwables in the Samza container. Catching Throwables can cause problems in specific situations, so I also added a partial function 'safely' that should take care of those situations.

Diffs
-
samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 85b012b

Diff: https://reviews.apache.org/r/37604/diff/

Testing
---

Thanks,
Aleksandar Bircakovic
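To illustrate the idea in the review request above, here is a small Java sketch of catching Throwable while still letting a few "fatal" throwables propagate. It is not the patch under review: the class SafelySketch, the helper isFatal, and the choice of which throwables count as fatal are all assumptions made for illustration (the selection is loosely modeled on scala.util.control.NonFatal).

```java
class SafelySketch {
    // Hypothetical policy: throwables that should never be swallowed by a catch-all.
    static boolean isFatal(Throwable t) {
        return t instanceof VirtualMachineError   // e.g. OutOfMemoryError, StackOverflowError
            || t instanceof LinkageError
            || t instanceof InterruptedException;
    }

    // Runs the body and reports what happened. Catching Throwable (rather than
    // Exception) also covers Errors such as AssertionError.
    static String runSafely(Runnable body) {
        try {
            body.run();
            return "ok";
        } catch (Throwable t) {
            if (isFatal(t)) {
                throw t; // precise rethrow: the body can only throw unchecked throwables
            }
            // A real container would log here; per the review comment, the
            // message should speak of a throwable, not an exception.
            return "caught " + t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(runSafely(() -> { }));
        System.out.println(runSafely(() -> { throw new IllegalStateException("bad state"); }));
        // AssertionError is an Error, so a plain catch (Exception e) would miss it.
        System.out.println(runSafely(() -> { throw new AssertionError("boom"); }));
    }
}
```

The unit test requested in the review could follow the same shape: throw an Error from the guarded code and check that the handler runs, then throw a fatal one and check that it propagates.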
Re: [Discuss/Vote] upgrade to Yarn 2.6.0
Works for me.

Sent from my iPhone

On Aug 24, 2015, at 7:48 AM, Yan Fang yanfang...@gmail.com wrote:

Hi Roger,

If you plan to upgrade to 2.6.0, and no other companies are using 2.4.0, I think we can upgrade to YARN 2.6.0 in 0.10.0.

Thanks,
Fang, Yan yanfang...@gmail.com
Re: [Discuss/Vote] upgrade to Yarn 2.6.0
Thanks a lot, Roger!

@Yan, I think that we already have +1 x 3 (binding) and +1 x 2 (non-binding). If there is no further objection, we can close the vote and change the description of SAMZA-536 to reflect that we are deprecating support for YARN 2.4 and 2.5.

Best, -Yi

On Mon, Aug 24, 2015 at 8:31 AM, Roger Hoover roger.hoo...@gmail.com wrote:

Works for me.

Sent from my iPhone
Re: [Discuss/Vote] upgrade to Yarn 2.6.0
Roger,

We upgraded from YARN 2.4 to 2.6 a while ago and have been running it in prod with no issues. It was basically a drop-in, if I remember right.

Jordan

On Aug 20, 2015, at 1:48 PM, Yi Pan nickpa...@gmail.com wrote:

Hi, Selina,

Samza 0.9.1 on YARN 2.6 is a proven working solution.

Best, -Yi
New Samza blog published - http://engineering.linkedin.com/performance/benchmarking-apache-samza-12-million-messages-second-single-node
Hi Samza open source,

I want to share that Tao Feng https://www.linkedin.com/pub/tao-feng/14/958/171 (from LinkedIn's Performance Team) has published a blog post http://engineering.linkedin.com/performance/benchmarking-apache-samza-12-million-messages-second-single-node on Samza perf benchmarks in collaboration with our development team. A lot of hard work went into this blog post, so please join me in congratulating Tao and the other contributors, and please share it with your Big Data social media circles.

Synopsis: The objective of the blog is to measure Samza's performance in terms of the message-processing rate for a single machine for typical Samza use cases. This will help engineers to understand and optimize performance and provide a basis for establishing a capacity model to run the Samza platform as a service.

Thanks,
Ed Yakabosky
Streams Infrastructure TPM, LinkedIn
Re: Review Request 37506: WIP: SAMZA-552 Operator API change: New Builder API
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/37506/#review96136
---

Hi, Milinda, sorry for the late review. I have put down my comments below. Overall, there are two things to be discussed:

1) Adding an OperatorBuilder interface as well. It serves two purposes:
   a) I remember that we discussed the need for this because, in the parsing/planning phase, there are cases where the required parameters for the operator are not generated or finalized yet (hence you have added some setter functions in OperatorSpec as a workaround). W/ OperatorBuilder, it is much easier: we just keep setting the parameters w/o calling build().
   b) In user code that uses the operator layer API directly, OperatorBuilder can make the TopologyBuilder code more intuitive and helps to hide all unnecessary specs, such as intermediate stream/table names and/or operator names.

2) The implementation details of TopologyBuilder. I would prefer to keep a graph-based implementation of TopologyBuilder internally, instead of a stack-based implementation, because of the more flexible representation a graph-based implementation allows. At the API level, we should first focus on DAG-like operators. However, I would prefer to keep the implementation flexible to avoid having to rewrite the TopologyBuilder class later, when we need to support non-DAG-like operators.

p.s. It would be good if you could modify the example tasks to use the fluent-style APIs, to illustrate what the user experience is like. And w/ the help of OperatorBuilder, the TopologyBuilder implementation can achieve this: if the user does not specify the input/output streams/tables (as with DAG-like operators), TopologyBuilder should be able to figure out and generate the intermediate stream/table names and connect the operators via those intermediate streams/tables. This is a step we must do anyway for DAG-like operators.
If the user specifies the input/output streams in the OperatorBuilder, the named streams/tables are created as vertices in the graph, and operators are connected to those vertices if they consume from those streams/tables. This is a simple extension of the DAG model that does not need a structural change in the TopologyBuilder. Just my two cents. Thanks!

samza-sql-core/src/main/java/org/apache/samza/sql/api/data/EntityName.java (line 161)
https://reviews.apache.org/r/37506/#comment151391

My original intention in introducing the anonymous stream here was to represent the intermediate streams/tables. If we explicitly introduce the intermediate streams and tables in the following methods, I think that we can drop the anonymous ones.

samza-sql-core/src/main/java/org/apache/samza/sql/api/data/EntityName.java (line 174)
https://reviews.apache.org/r/37506/#comment151392

Could you elaborate more on what is to be fixed here?

samza-sql-core/src/main/java/org/apache/samza/sql/api/expressions/ScalarExpression.java (line 28)
https://reviews.apache.org/r/37506/#comment151393

Is this going to be the interface exposed to users who are writing SQL tasks? It would be good to avoid using the generic Object class in the interface classes between the Samza framework and user code, to follow the spirit of SAMZA-697.

samza-sql-core/src/main/java/org/apache/samza/sql/api/expressions/TupleExpression.java (line 28)
https://reviews.apache.org/r/37506/#comment151394

Same here.

samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/OperatorSink.java (line 19)
https://reviews.apache.org/r/37506/#comment151395

I think that in the new TopologyBuilder + OperatorBuilder, there is a way to remove the OperatorSink and OperatorSource interfaces. The main purpose of those interfaces is to refer to a partial topology that a) has one output, or b) has one input that has not been bound to a system stream/table or an intermediate stream/table. My thinking is that if we follow an API similar to Trident, any immediately connected operators won't require the sink/source interfaces, and any not-immediately connected operators will need to connect via a named intermediate stream/table. Hence, this removes the need to create OperatorSink/OperatorSource classes.

samza-sql-core/src/main/java/org/apache/samza/sql/data/IncomingMessageTuple.java
https://reviews.apache.org/r/37506/#comment151396

nit: I still think that a note here stressing the need to get the real event time, instead of the message's receive time based on the local system, is important.

samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/SimpleOperatorFactoryImpl.java (line 77)
https://reviews.apache.org/r/37506/#comment151397

Why do we need this? I thought that we can directly produce to the system streams, w/ the
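As a rough illustration of the graph-based builder idea discussed in the review above (generated names for intermediate streams when the user gives none, named streams as graph vertices otherwise), here is a small Java sketch. Everything in it is hypothetical: TopologySketch, from, then, outputOf, consumersOf, and the "intermediate-N" naming scheme are invented for this example and are not the Samza TopologyBuilder or OperatorBuilder API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TopologySketch {
    // Stream/table vertex -> operators that consume from it (the graph's edges).
    private final Map<String, List<String>> consumers = new LinkedHashMap<>();
    // Operator vertex -> the stream/table it produces.
    private final Map<String, String> outputs = new LinkedHashMap<>();
    private String current;     // stream produced by the most recently added operator
    private int generated = 0;  // counter used for generated intermediate names

    static TopologySketch from(String sourceStream) {
        TopologySketch t = new TopologySketch();
        t.consumers.put(sourceStream, new ArrayList<>());
        t.current = sourceStream;
        return t;
    }

    // No output given: the builder generates an intermediate stream name.
    TopologySketch then(String operator) {
        return then(operator, "intermediate-" + generated++);
    }

    // Output given: the named stream becomes a vertex other operators can consume.
    TopologySketch then(String operator, String outputStream) {
        consumers.get(current).add(operator);
        consumers.putIfAbsent(outputStream, new ArrayList<>());
        outputs.put(operator, outputStream);
        current = outputStream;
        return this;
    }

    String outputOf(String operator) {
        return outputs.get(operator);
    }

    List<String> consumersOf(String stream) {
        return Collections.unmodifiableList(
            consumers.getOrDefault(stream, Collections.emptyList()));
    }

    public static void main(String[] args) {
        TopologySketch t = TopologySketch.from("orders")
            .then("filter")                      // output name is generated
            .then("window", "windowed-orders");  // output name is user-specified
        System.out.println(t.outputOf("filter"));            // intermediate-0
        System.out.println(t.consumersOf("intermediate-0")); // [window]
    }
}
```

Because streams and operators are both stored as vertices, an operator that consumes a named stream produced elsewhere could be attached without changing the data structure, which is the flexibility argued for over a stack-based builder.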
RE: SAMZA build failing!!!
Hi Fang,

I followed the steps below:

1. I downloaded the code from https://github.com/apache/samza.git, cloned to the desktop.
2. cd into the code directory.
3. Ran gradle -b bootstrap.gradle
4. Then tried the two ways below; neither works:
   a. gradlew -PscalaVersion=2.10 clean build
   b. gradlew clean build

The build fails while running the tests for samza-core, when executing the samza-core:test task. Below is the error message I am receiving:

==
1 warning
:samza-autoscaling_2.10:javadoc
:samza-autoscaling_2.10:javadocJar
:samza-autoscaling_2.10:sourcesJar
:samza-autoscaling_2.10:signArchives SKIPPED
:samza-autoscaling_2.10:assemble
:samza-autoscaling_2.10:checkstyleMain
:samza-autoscaling_2.10:compileTestJava UP-TO-DATE
:samza-autoscaling_2.10:compileTestScala UP-TO-DATE
:samza-autoscaling_2.10:processTestResources UP-TO-DATE
:samza-autoscaling_2.10:testClasses UP-TO-DATE
:samza-autoscaling_2.10:checkstyleTest UP-TO-DATE
:samza-autoscaling_2.10:test UP-TO-DATE
:samza-autoscaling_2.10:check
:samza-autoscaling_2.10:build
:samza-core_2.10:javadocJar
:samza-core_2.10:sourcesJar
:samza-core_2.10:signArchives SKIPPED
:samza-core_2.10:assemble
:samza-core_2.10:checkstyleMain
:samza-core_2.10:compileTestJava
Note: C:\Users\z013sqm\Desktop\POCs\samza1 - Copy\samza-core\src\test\java\org\apache\samza\coordinator\stream\TestCoordinatorStreamWriter.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
:samza-core_2.10:compileTestScala
[ant:scalac] Element samza1 - Copy\samza-core\build\resources\main' does not exist.
:samza-core_2.10:processTestResources
:samza-core_2.10:testClasses
:samza-core_2.10:checkstyleTest
:samza-core_2.10:test

testCanReadPropertiesConfigFiles FAILED
    java.lang.IllegalArgumentException: Illegal character in authority at index 7: file://samza1 - Copy\samza-core/src/test/resources/test.properties
        at java.net.URI.create(URI.java:859)
        at org.apache.samza.config.factories.TestPropertiesConfigFactory.testCanReadPropertiesConfigFiles(TestPropertiesConfigFactory.scala:34)
    Caused by: java.net.URISyntaxException: Illegal character in authority at index 7: file://samza1 - Copy\samza-core/src/test/resources/test.properties
        at java.net.URI$Parser.fail(URI.java:2829)
        at java.net.URI$Parser.parseAuthority(URI.java:3167)
        at java.net.URI$Parser.parseHierarchical(URI.java:3078)
        at java.net.URI$Parser.parse(URI.java:3034)
        at java.net.URI.init(URI.java:595)
        at java.net.URI.create(URI.java:857)
        ... 1 more

testStorageEngineReceivedAllValues FAILED
    org.junit.ComparisonFailure: expected:[/tmp/testing/state/testStore/]Partition_1 but was:[\tmp\testing\state\testStore\]Partition_1
        at org.junit.Assert.assertEquals(Assert.java:123)
        at org.junit.Assert.assertEquals(Assert.java:145)
        at org.apache.samza.storage.TestStorageRecovery.testStorageEngineReceivedAllValues(TestStorageRecovery.java:84)

150 tests completed, 3 failed, 1 skipped
:samza-core_2.10:test FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':samza-core_2.10:test'.
There were failing tests. See the report at: file:///samza1%20-%20Copy/samza-core/build/reports/tests/index.html

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED
==

Please guide me on what I am doing wrong. Also, please point me to a document that describes the build steps.

Regards, Raja Mahesh Aravapalli.
-Original Message-
From: Yan Fang [mailto:yanfang...@gmail.com]
Sent: Monday, August 24, 2015 8:10 PM
To: dev@samza.apache.org
Subject: Re: SAMZA build failing!!!

Hi Raja,

Did you run only samza-core or the whole Samza project? I downloaded Samza from the master branch and ran ./gradlew clean build. There was no error. Could you give a little more information about how you got this error?

Thanks,
Fang, Yan yanfang...@gmail.com
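The second failure in the log above (testStorageEngineReceivedAllValues) is a string comparison between a path written with forward slashes and the same path rendered with backslashes. Below is a minimal Java sketch of a separator-insensitive comparison. It assumes the test compares path strings directly, which the log suggests but does not prove, and the names PathCompareSketch, normalizeSeparators and samePath are hypothetical.

```java
class PathCompareSketch {
    // Rewrites Windows-style separators to forward slashes. Only suitable for
    // comparisons in tests: on Unix a backslash is a legal file name character.
    static String normalizeSeparators(String path) {
        return path.replace('\\', '/');
    }

    // Compares two path strings while ignoring which separator style was used.
    static boolean samePath(String expected, String actual) {
        return normalizeSeparators(expected).equals(normalizeSeparators(actual));
    }

    public static void main(String[] args) {
        String expected = "/tmp/testing/state/testStore/Partition_1";
        String actual = "\\tmp\\testing\\state\\testStore\\Partition_1";
        System.out.println(expected.equals(actual));    // false
        System.out.println(samePath(expected, actual)); // true
    }
}
```

An alternative with the same effect would be to build the expected value with java.io.File or java.nio.file.Paths, so that it picks up the platform's separator instead of a hard-coded one.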