Unable to run the sample job
Hi All,

I started tinkering with the source code of Amaterasu and just wanted to confirm whether I am missing any step. Here are the steps that I followed:

1. Installed a single-node Mesos cluster (version 1.4) on Ubuntu 16.04
2. Generated the Amaterasu tar file after building the project from source.
3. Tested Mesos by executing the following command:
   mesos-execute --master=$MASTER --name="cluster-test" --command="sleep 5"
4. Ran the command for deploying the Spark job using Amaterasu:
   ./ama-start.sh --repo="https://github.com/shintoio/amaterasu-job-sample.git" --branch="master" --env="test" --report="code"

Following is the error log:

repo: https://github.com/shintoio/amaterasu-job-sample.git
2017-10-30 14:00:09.761:INFO::main: Logging initialized @430ms
2017-10-30 14:00:09.829:INFO:oejs.Server:main: jetty-9.2.z-SNAPSHOT
2017-10-30 14:00:09.864:INFO:oejsh.ContextHandler:main: Started o.e.j.s.ServletContextHandler@8c3b9d{/,file:/home/shad/apps/apache-amaterasu-0.2.0-incubating/dist/,AVAILABLE}
2017-10-30 14:00:09.882:INFO:oejs.ServerConnector:main: Started ServerConnector@58ea606c{HTTP/1.1}{0.0.0.0:8000}
2017-10-30 14:00:09.882:INFO:oejs.Server:main: Started @553ms
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
I1030 14:00:10.085204 16586 sched.cpp:232] Version: 1.4.0
I1030 14:00:10.088812 16630 sched.cpp:336] New master detected at master@127.0.1.1:5050
I1030 14:00:10.089205 16630 sched.cpp:352] No credentials provided. Attempting to register without authentication
I1030 14:00:10.090991 16624 sched.cpp:759] Framework registered with e72b9609-4b7f-4509-b1a4-bd2055d674aa-0002
Exception in thread "Thread-13" while scanning for the next token found character '\t(TAB)' that cannot start any token.
(Do not use \t(TAB) for indentation)
 in 'reader', line 2, column 1:
    "name":"test",
    ^
    at org.yaml.snakeyaml.scanner.ScannerImpl.fetchMoreTokens(ScannerImpl.java:420)
    at org.yaml.snakeyaml.scanner.ScannerImpl.checkToken(ScannerImpl.java:226)
    at org.yaml.snakeyaml.parser.ParserImpl$ParseImplicitDocumentStart.produce(ParserImpl.java:194)
    at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:157)
    at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(ParserImpl.java:147)
    at org.yaml.snakeyaml.composer.Composer.getSingleNode(Composer.java:104)
    at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:122)
    at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:505)
    at org.yaml.snakeyaml.Yaml.load(Yaml.java:436)
    at org.apache.amaterasu.leader.utilities.DataLoader$.yamlToMap(DataLoader.scala:87)
    at org.apache.amaterasu.leader.utilities.DataLoader$$anonfun$3.apply(DataLoader.scala:68)
    at org.apache.amaterasu.leader.utilities.DataLoader$$anonfun$3.apply(DataLoader.scala:68)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
    at org.apache.amaterasu.leader.utilities.DataLoader$.getExecutorData(DataLoader.scala:68)
    at org.apache.amaterasu.leader.mesos.schedulers.JobScheduler$$anonfun$resourceOffers$1.apply(JobScheduler.scala:163)
    at org.apache.amaterasu.leader.mesos.schedulers.JobScheduler$$anonfun$resourceOffers$1.apply(JobScheduler.scala:128)
    at scala.collection.Iterator$class.foreach(Iterator.scala:891)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.amaterasu.leader.mesos.schedulers.JobScheduler.resourceOffers(JobScheduler.scala:128)
I1030 14:00:12.991056 16624 sched.cpp:2055] Asked to abort the driver
I1030 14:00:12.991195 16624 sched.cpp:1233] Aborting framework e72b9609-4b7f-4509-b1a4-bd2055d674aa-0002

Thanks,
Shad
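For reference, the SnakeYAML failure above comes from a TAB at the start of line 2 of one of the job's YAML definition files; YAML indentation must use spaces. A minimal sketch of the fix, assuming a file shaped like the fragment quoted in the error (the actual file in the sample repo may differ):

```yaml
# Broken (rejected by SnakeYAML): line 2 begins with a TAB character,
# which triggers "found character '\t(TAB)' that cannot start any token".
#
# {
# <TAB>"name":"test",
# ...
#
# Fixed: the same flow mapping, indented with spaces only.
{
  "name": "test"
}
```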
Re: Unable to run the sample job
Thanks Yaniv for the reply. The job now starts, but it completed with some errors. I wanted to know whether the current implementation has the ability to trigger the job from a Spark batch jar file, instead of reading from a GitHub repo? Or should this be considered a feature request? PFB the error logs.

===> Executor 02-efce1eff-b351-4bc3-9bb2-776af04d3b3b registered
===> a provider for group spark was created
===> launching task: 02
===> = started action start =
===> val data = 1 to 1000
===> val rdd = sc.parallelize(data)
===> val odd = rdd.filter(n => n%2 != 0).toDF("number")
===> = finished action start =
===> complete task: 02
===> launching task: 03
===> = started action start =
===> val highNoDf = AmaContext.getDataFrame("start", "odd").where("number > 100")
===> highNoDf.write.mode(SaveMode.Overwrite).json(s"${env.outputRootPath}/dtatframe_res")
===> org.apache.spark.SparkException: Job aborted.
===> Executor 03-ada561dd-f38b-4741-9ca5-eeb20780bbcf registered
===> a provider for group spark was created
===> launching task: 03
===> = started action step2 =
===> val highNoDf = AmaContext.getDataFrame("start", "odd").where("number > 100")
===> highNoDf.write.mode(SaveMode.Overwrite).json(s"${env.outputRootPath}/dtatframe_res")
===> org.apache.spark.SparkException: Job aborted.
===> Executor 03-c65bcfd5-215b-46dc-9bd8-b38d31f50bb1 registered
===> a provider for group spark was created
===> launching task: 03
===> = started action step2 =
===> val highNoDf = AmaContext.getDataFrame("start", "odd").where("number > 100")
===> highNoDf.write.mode(SaveMode.Overwrite).json(s"${env.outputRootPath}/dtatframe_res")
===> org.apache.spark.SparkException: Job aborted.
===> moving to err action null
2017-10-31 10:53:37.730:INFO:oejs.ServerConnector:Thread-73: Stopped ServerConnector@58ea606c{HTTP/1.1}{0.0.0.0:8000}
2017-10-31 10:53:37.732:INFO:oejsh.ContextHandler:Thread-73: Stopped o.e.j.s.ServletContextHandler@8c3b9d{/,file:/home/shad/apps/apache-amaterasu-0.2.0-incubating/dist/,UNAVAILABLE}
I1031 10:53:37.733896 36249 sched.cpp:2021] Asked to stop the driver
I1031 10:53:37.734133 36249 sched.cpp:1203] Stopping framework afd772c2-509b-4782-a4c6-4cd9a2985cc2-0001
W00t amaterasu job is finished!!!

Thanks,
Shad

On Tue, Oct 31, 2017 at 5:02 AM, Yaniv Rodenski wrote:

> Hi Shad,
>
> sorry about that, there was indeed an issue with the job definition.
> Should be fine now.
>
> Cheers,
> Yaniv
>
> On Mon, Oct 30, 2017 at 9:02 PM, Shad Amez wrote:
>
> > Hi All,
> >
> > I started tinkering with source code of Amaterasu and just wanted to
> > confirm if I am missing any step.
> >
> > Here are the steps that I followed :
> >
> > 1. Installed a single node mesos cluster (version 1.4) in Ubuntu 16.0.4
> > 2. Generated the amaterasu tar file post building the project from source.
> > 3. Tested if mesos by executing the following command :
> >    mesos-execute --master=$MASTER --name="cluster-test" --command="sleep 5"
> > 4. Ran the command for deploying the spark job using Amaterasu
> >    ./ama-start.sh --repo="https://github.com/shintoio/amaterasu-job-sample.git"
> >    --branch="master" --env="test" --report="code"
> >
> > Following is error log
> >
> > repo: https://github.com/shintoio/amaterasu-job-sample.git
> > 2017-10-30 14:00:09.761:INFO::main: Logging initialized @430ms
> > 2017-10-30 14:00:09.829:INFO:oejs.Server:main: jetty-9.2.z-SNAPSHOT
> > 2017-10-30 14:00:09.864:INFO:oejsh.ContextHandler:main: Started
> > o.e.j.s.ServletContextHandler@8c3b9d{/,file:/home/shad/apps/apache-amaterasu-0.2.0-incubating/dist/,AVAILABLE}
> > 2017-10-30 14:00:09.882:INFO:oejs.ServerConnector:main: Started
> > ServerConnector@58ea606c{HTTP/1.1}{0.0.0.0:8000}
> > 2017-10-30 14:00:09.882:INFO:oejs.Server:main: Started @553ms
> > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> > SLF4J: Defaulting to no-operation (NOP) logger implementation
> > SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
> > details.
> > I1030 14:00:10.085204 16586 sched.cpp:232] Version: 1.4.0
> > I1030 14:00:10.088812 16630 sched.cpp:336] New master detected at
> > master@127.0.1.1:5050
> > I1030 14:00:10.089205 16630
Re: [VOTE] Amaterasu release 0.2.0-incubating, release candidate #1
Hi Nadav/Team,

FYI, I have faced these test failures:

[Thread-3] ERROR org.apache.curator.test.TestingZooKeeperServer - .
java.net.BindException: Address already in use

This is caused when there is another instance of ZooKeeper running on port 2181. If this ZooKeeper instance is shut down, the tests run successfully.

Regards,
Shad

On Tue, Apr 17, 2018 at 9:58 AM, Yaniv Rodenski wrote:

> Hi Nadav,
>
> OK, I did have a closer look this morning on a clean environment and it
> seems that there is something wrong with the build coming out of Travis.
> I suggest we stop the vote for now and Guy and myself will investigate.
>
> Thanks, everyone
> Yaniv
>
> On Tue, Apr 17, 2018 at 4:27 AM, Nadav Har Tzvi wrote:
>
> > -1
> >
> > There are a few issues:
> >
> > 1. Travis doesn't invoke gradlew test, thus the tests don't run at all in
> >    the CI environment.
> > 2. When I deployed Amaterasu on both the mesos vagrant box and the HDP box,
> >    the action tests failed. Here are the stack traces:
> >
> > [Thread-3] ERROR org.apache.curator.test.TestingZooKeeperServer - From
> > testing server (random state: false) for instance:
> > InstanceSpec{dataDirectory=/tmp/1523902177319-0, port=2181, electionPort=38827,
> > quorumPort=40951, deleteDataDirectoryOnClose=true, serverId=3, tickTime=-1,
> > maxClientCnxns=-1} org.apache.curator.test.InstanceSpec@885
> > java.net.BindException: Address already in use
> >     at sun.nio.ch.Net.bind0(Native Method)
> >     at sun.nio.ch.Net.bind(Net.java:433)
> >     at sun.nio.ch.Net.bind(Net.java:425)
> >     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
> >     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
> >     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
> >     at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
> >     at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:111)
> >     at org.apache.curator.test.TestingZooKeeperMain.runFromConfig(TestingZooKeeperMain.java:73)
> >     at org.apache.curator.test.TestingZooKeeperServer$1.run(TestingZooKeeperServer.java:148)
> >     at java.lang.Thread.run(Thread.java:748)
> >
> > org.apache.amaterasu.common.execution.ActionTests *** ABORTED *** (3 milliseconds)
> >   java.lang.RuntimeException: Unable to load a Suite class that was
> >   discovered in the runpath: org.apache.amaterasu.common.execution.ActionTests
> >     at org.scalatest.tools.DiscoverySuite$.getSuiteInstance(DiscoverySuite.scala:81)
> >     at org.scalatest.tools.DiscoverySuite$$anonfun$1.apply(DiscoverySuite.scala:38)
> >     at org.scalatest.tools.DiscoverySuite$$anonfun$1.apply(DiscoverySuite.scala:37)
> >     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> >     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> >     at scala.collection.Iterator$class.foreach(Iterator.scala:891)
> >     at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
> >     at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> >     at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> >     at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> >   ...
> >   Cause: java.net.BindException: Address already in use
> >     at sun.nio.ch.Net.bind0(Native Method)
> >     at sun.nio.ch.Net.bind(Net.java:433)
> >     at sun.nio.ch.Net.bind(Net.java:425)
> >     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
> >     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
> >     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
> >     at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
> >     at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:111)
> >     at org.apache.curator.test.TestingZooKeeperMain.runFromConfig(TestingZooKeeperMain.java:73)
> >     at org.apache.curator.test.TestingZooKeeperServer$1.run(TestingZooKeeperServer.java:148)
> >
> > 3. I received the above error also when testing in the local development
> >    environment.
> >
> > Do other committers manage to reproduce this? Eyal? Kirupa?
> >
> > On 16 Apr 2018, at 16:46, Yaniv Rodenski wrote:
> >
> > Hi everyone,
> >
> > Please review and vote on the release candidate #1 for the version
> > 0.2.0-incubating, as follows:
> >
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > The complete staging area is available for your review, which includes:
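The BindException above means Curator's TestingZooKeeperServer could not bind port 2181 because a real ZooKeeper instance was already listening on it. As a quick pre-flight check before running the test suite, a small standalone sketch (plain Python, not part of Amaterasu; the port number is ZooKeeper's default client port) can confirm whether the port is free:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something already accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on a successful connect, an errno otherwise
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    # Curator's testing server wants ZooKeeper's default client port, 2181.
    if port_in_use(2181):
        print("port 2181 busy: stop the running ZooKeeper before the tests")
    else:
        print("port 2181 free")
```

If the port is busy, stopping the stray ZooKeeper instance (as described above) lets the tests run.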
[jira] [Commented] (AMATERASU-4) Run an Amaterasu pipeline
[ https://issues.apache.org/jira/browse/AMATERASU-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223311#comment-16223311 ]

Shad Amez commented on AMATERASU-4:
-----------------------------------

Is this project a fork of https://github.com/shintoio/amaterasu ? Can I get access to the source code/wiki/website to explore and contribute to the project?

> Run an Amaterasu pipeline
> -------------------------
>
>                 Key: AMATERASU-4
>                 URL: https://issues.apache.org/jira/browse/AMATERASU-4
>             Project: AMATERASU
>          Issue Type: Sub-task
>            Reporter: Nadav Har Tzvi
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> The user will invoke "ama run"
> "ama run" will take in the following parameters (based on ama-start.sh):
> -r, --repo =
> -b, --branch = , the default is "master"
> -e, --env = , this should correspond to a path under /env directory,
>   e.g. /env/default, /env/test, etc. The default value is "default"
> -n, --name =
> -i, --job-id = TBD
> -r, --report =
> Invocation will start Amaterasu on demand.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
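The ticket above sketches a CLI surface for "ama run". A hypothetical mock-up of that argument parsing (Python argparse, purely illustrative; the real entry point is the ama-start.sh shell script, and all names here come from the ticket text, not from Amaterasu's code). Note the ticket assigns -r to both --repo and --report, which a real parser would reject, so the short flag is kept only for --repo here:

```python
import argparse

# Illustrative sketch of the "ama run" flags described in AMATERASU-4.
# Defaults follow the ticket text; this is not the actual Amaterasu CLI.
parser = argparse.ArgumentParser(prog="ama run")
parser.add_argument("-r", "--repo", required=True, help="job repository URL")
parser.add_argument("-b", "--branch", default="master")
parser.add_argument("-e", "--env", default="default",
                    help="corresponds to a path under /env, e.g. /env/test")
parser.add_argument("-n", "--name", help="job name")
parser.add_argument("-i", "--job-id", help="TBD in the ticket")
parser.add_argument("--report", help="report level, e.g. 'code'")

# Mirrors the ama-start.sh invocation from the sample-job thread above.
args = parser.parse_args(
    ["--repo", "https://github.com/shintoio/amaterasu-job-sample.git",
     "--env", "test", "--report", "code"])
print(args.branch, args.env, args.report)  # -> master test code
```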
[jira] [Commented] (AMATERASU-20) Build fails on travis containers
[ https://issues.apache.org/jira/browse/AMATERASU-20?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430342#comment-16430342 ]

Shad Amez commented on AMATERASU-20:
------------------------------------

[~yaniv] I have fixed the build error, which was caused by incorrect handling of a data path containing upper-case/CamelCase characters. Let me know if it's OK to send the pull request.

> Build fails on travis containers
> --------------------------------
>
>                 Key: AMATERASU-20
>                 URL: https://issues.apache.org/jira/browse/AMATERASU-20
>             Project: AMATERASU
>          Issue Type: Bug
>    Affects Versions: 0.2.0-incubating
>            Reporter: Yaniv Rodenski
>            Assignee: Yaniv Rodenski
>            Priority: Major
>             Fix For: 0.2.0-incubating
>
> The build is failing due to absolute paths in some tests

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
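The comment above attributes the failure to case handling in a data path. As a general illustration of that bug class (hypothetical paths, not Amaterasu's actual code): on the Linux filesystems these CI containers use, path lookup is case-sensitive, so a path assembled as "dataPath" will never match a directory created as "datapath":

```python
from pathlib import PurePosixPath

# Hypothetical paths illustrating the bug class: POSIX path comparison is
# case-sensitive, so mixed-case segments must match exactly.
created = PurePosixPath("/tmp/amaterasu/dataPath/part-0000")
looked_up = PurePosixPath("/tmp/amaterasu/datapath/part-0000")

print(created == looked_up)  # False: "dataPath" != "datapath"

# Normalizing both sides to one case before comparing avoids the mismatch.
print(str(created).lower() == str(looked_up).lower())  # True
```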
[jira] [Assigned] (AMATERASU-21) Fix Spark scala tests
[ https://issues.apache.org/jira/browse/AMATERASU-21?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shad Amez reassigned AMATERASU-21:
----------------------------------

Assignee: Shad Amez

> Fix Spark scala tests
> ---------------------
>
>                 Key: AMATERASU-21
>                 URL: https://issues.apache.org/jira/browse/AMATERASU-21
>             Project: AMATERASU
>          Issue Type: Task
>    Affects Versions: 0.2.1-incubating
>            Reporter: Yaniv Rodenski
>            Assignee: Shad Amez
>            Priority: Major
>
> Spark Scala tests are currently commented out; they need to be fixed and
> added back into the Spark test suite.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)