Unable to run the sample job

2017-10-30 Thread Shad Amez
Hi All,

I started tinkering with the source code of Amaterasu and just wanted to
confirm if I am missing any step.

Here are the steps that I followed:

1. Installed a single-node Mesos cluster (version 1.4) on Ubuntu 16.04
2. Generated the Amaterasu tar file after building the project from source.
3. Tested that Mesos works by executing the following command:
   mesos-execute --master=$MASTER --name="cluster-test" --command="sleep 5"
4. Ran the command for deploying the spark job using Amaterasu
 ./ama-start.sh --repo="https://github.com/shintoio/amaterasu-job-sample.git"
--branch="master" --env="test" --report="code"

Following is the error log:

repo: https://github.com/shintoio/amaterasu-job-sample.git
2017-10-30 14:00:09.761:INFO::main: Logging initialized @430ms
2017-10-30 14:00:09.829:INFO:oejs.Server:main: jetty-9.2.z-SNAPSHOT
2017-10-30 14:00:09.864:INFO:oejsh.ContextHandler:main: Started
o.e.j.s.ServletContextHandler@8c3b9d
{/,file:/home/shad/apps/apache-amaterasu-0.2.0-incubating/dist/,AVAILABLE}
2017-10-30 14:00:09.882:INFO:oejs.ServerConnector:main: Started
ServerConnector@58ea606c{HTTP/1.1}{0.0.0.0:8000}
2017-10-30 14:00:09.882:INFO:oejs.Server:main: Started @553ms
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
details.
I1030 14:00:10.085204 16586 sched.cpp:232] Version: 1.4.0
I1030 14:00:10.088812 16630 sched.cpp:336] New master detected at
master@127.0.1.1:5050
I1030 14:00:10.089205 16630 sched.cpp:352] No credentials provided.
Attempting to register without authentication
I1030 14:00:10.090991 16624 sched.cpp:759] Framework registered with
e72b9609-4b7f-4509-b1a4-bd2055d674aa-0002
Exception in thread "Thread-13" while scanning for the next token
found character '\t(TAB)' that cannot start any token. (Do not use \t(TAB) for indentation)
 in 'reader', line 2, column 1:
    "name":"test",
    ^

at org.yaml.snakeyaml.scanner.ScannerImpl.fetchMoreTokens(ScannerImpl.java:420)
at org.yaml.snakeyaml.scanner.ScannerImpl.checkToken(ScannerImpl.java:226)
at org.yaml.snakeyaml.parser.ParserImpl$ParseImplicitDocumentStart.produce(ParserImpl.java:194)
at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:157)
at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(ParserImpl.java:147)
at org.yaml.snakeyaml.composer.Composer.getSingleNode(Composer.java:104)
at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:122)
at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:505)
at org.yaml.snakeyaml.Yaml.load(Yaml.java:436)
at org.apache.amaterasu.leader.utilities.DataLoader$.yamlToMap(DataLoader.scala:87)
at org.apache.amaterasu.leader.utilities.DataLoader$$anonfun$3.apply(DataLoader.scala:68)
at org.apache.amaterasu.leader.utilities.DataLoader$$anonfun$3.apply(DataLoader.scala:68)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
at org.apache.amaterasu.leader.utilities.DataLoader$.getExecutorData(DataLoader.scala:68)
at org.apache.amaterasu.leader.mesos.schedulers.JobScheduler$$anonfun$resourceOffers$1.apply(JobScheduler.scala:163)
at org.apache.amaterasu.leader.mesos.schedulers.JobScheduler$$anonfun$resourceOffers$1.apply(JobScheduler.scala:128)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at org.apache.amaterasu.leader.mesos.schedulers.JobScheduler.resourceOffers(JobScheduler.scala:128)
I1030 14:00:12.991056 16624 sched.cpp:2055] Asked to abort the driver
I1030 14:00:12.991195 16624 sched.cpp:1233] Aborting framework
e72b9609-4b7f-4509-b1a4-bd2055d674aa-0002
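
(Editor's note on the trace: SnakeYAML refuses any tab character used for
indentation; YAML permits only spaces. A minimal hypothetical fragment, not
the actual maki.yml from the sample repo, showing the failing and working
forms:)

```yaml
# Fails: the nested line below the key is indented with a tab character,
# which SnakeYAML rejects with "found character '\t(TAB)' that cannot
# start any token":
#
# flow:
# <TAB>"name": "test"
#
# Works: the same structure indented with two spaces.
flow:
  "name": "test"
```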

Thanks,
Shad


Re: Unable to run the sample job

2017-10-31 Thread Shad Amez
Thanks Yaniv for the reply. The job now starts, but it completes with some
errors.

I wanted to know if the current implementation has the ability to trigger
the job from a Spark batch jar file, instead of reading from a GitHub repo,
or should this be considered a feature request?

Please find the error logs below.

===> Executor 02-efce1eff-b351-4bc3-9bb2-776af04d3b3b registered
===> a provider for group spark was created
===> launching task: 02
===> = started action start =
===> val data = 1 to 1000
===> val rdd = sc.parallelize(data)
===> val odd = rdd.filter(n => n%2 != 0).toDF("number")
===> = finished action start =
===> complete task: 02
===> launching task: 03
===> = started action start =
===> val highNoDf = AmaContext.getDataFrame("start", "odd").where("number > 100")
===> highNoDf.write.mode(SaveMode.Overwrite).json(s"${env.outputRootPath}/dtatframe_res")
===> org.apache.spark.SparkException: Job aborted.
===> Executor 03-ada561dd-f38b-4741-9ca5-eeb20780bbcf registered
===> a provider for group spark was created
===> launching task: 03
===> = started action step2 =
===> val highNoDf = AmaContext.getDataFrame("start", "odd").where("number > 100")
===> highNoDf.write.mode(SaveMode.Overwrite).json(s"${env.outputRootPath}/dtatframe_res")
===> org.apache.spark.SparkException: Job aborted.
===> Executor 03-c65bcfd5-215b-46dc-9bd8-b38d31f50bb1 registered
===> a provider for group spark was created
===> launching task: 03
===> = started action step2 =
===> val highNoDf = AmaContext.getDataFrame("start", "odd").where("number > 100")
===> highNoDf.write.mode(SaveMode.Overwrite).json(s"${env.outputRootPath}/dtatframe_res")
===> org.apache.spark.SparkException: Job aborted.
===> moving to err action null
2017-10-31 10:53:37.730:INFO:oejs.ServerConnector:Thread-73: Stopped
ServerConnector@58ea606c{HTTP/1.1}{0.0.0.0:8000}
2017-10-31 10:53:37.732:INFO:oejsh.ContextHandler:Thread-73: Stopped
o.e.j.s.ServletContextHandler@8c3b9d
{/,file:/home/shad/apps/apache-amaterasu-0.2.0-incubating/dist/,UNAVAILABLE}
I1031 10:53:37.733896 36249 sched.cpp:2021] Asked to stop the driver
I1031 10:53:37.734133 36249 sched.cpp:1203] Stopping framework
afd772c2-509b-4782-a4c6-4cd9a2985cc2-0001


W00t amaterasu job is finished!!!

Thanks,
Shad

On Tue, Oct 31, 2017 at 5:02 AM, Yaniv Rodenski  wrote:

> Hi Shad,
>
> sorry about that, there was indeed an issue with the job definition.
> Should be fine now.
>
> Cheers,
> Yaniv
>
> On Mon, Oct 30, 2017 at 9:02 PM, Shad Amez  wrote:
>
> > Hi All,
> >
> > I started tinkering with source code of Amaterasu and just wanted to
> > confirm if I am missing any step.
> >
> > Here are the steps that I followed :
> >
> > 1. Installed a single node mesos cluster (version 1.4) in Ubuntu 16.0.4
> > 2. Generated the amaterasu tar file post building the project from
> source.
> > 3. Tested if mesos by executing the following command :
> >mesos-execute --master=$MASTER --name="cluster-test" --command="sleep
> 5"
> > 4. Ran the command for deploying the spark job using Amaterasu
> >  ./ama-start.sh --repo="https://github.com/shintoio/amaterasu-job-sample
> .
> > git"
> > --branch="master" --env="test" --report="code"
> >
> > Following is error log
> >
> > repo: https://github.com/shintoio/amaterasu-job-sample.git
> > 2017-10-30 14:00:09.761:INFO::main: Logging initialized @430ms
> > 2017-10-30 14:00:09.829:INFO:oejs.Server:main: jetty-9.2.z-SNAPSHOT
> > 2017-10-30 14:00:09.864:INFO:oejsh.ContextHandler:main: Started
> > o.e.j.s.ServletContextHandler@8c3b9d
> > {/,file:/home/shad/apps/apache-amaterasu-0.2.0-
> incubating/dist/,AVAILABLE}
> > 2017-10-30 14:00:09.882:INFO:oejs.ServerConnector:main: Started
> > ServerConnector@58ea606c{HTTP/1.1}{0.0.0.0:8000}
> > 2017-10-30 14:00:09.882:INFO:oejs.Server:main: Started @553ms
> > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> > SLF4J: Defaulting to no-operation (NOP) logger implementation
> > SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for
> further
> > details.
> > I1030 14:00:10.085204 16586 sched.cpp:232] Version: 1.4.0
> > I1030 14:00:10.088812 16630 sched.cpp:336] New master detected at
> > master@127.0.1.1:5050
> > I1030 14:00:10.089205 16630


Re: [VOTE] Amaterasu release 0.2.0-incubating, release candidate #1

2018-04-16 Thread Shad Amez
Hi Nadav/Team,

FYI, I have faced these test failures:

[Thread-3] ERROR org.apache.curator.test.TestingZooKeeperServer - java.net.BindException:
Address already in use

This is caused when another instance of ZooKeeper is already running on
port 2181. If that ZooKeeper instance is shut down, the tests run
successfully.
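
A quick way to check whether something is already listening on a port
(e.g. ZooKeeper's default 2181) before re-running the tests; this is a
generic Python-stdlib sketch, not Amaterasu code:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if the port is already bound on the host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))  # fails with EADDRINUSE if taken
        except OSError:           # "Address already in use"
            return True
        return False

# Demonstrate the conflict: hold one listener open, then try to bind again.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))   # port 0 -> the OS picks a free port
holder.listen(1)
port = holder.getsockname()[1]
print(port_in_use(port))        # True: same failure mode as in the tests
holder.close()
print(port_in_use(port))        # False once the first listener is gone
```

For the real case, `port_in_use(2181)` tells you whether an external
ZooKeeper needs to be shut down before the test run.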

Regards,
Shad

On Tue, Apr 17, 2018 at 9:58 AM, Yaniv Rodenski  wrote:

> Hi Nadav,
>
> OK, I had a closer look this morning in a clean environment, and it
> seems that there is something wrong with the build coming out of Travis.
> I suggest we stop the vote for now; Guy and I will investigate.
>
> Thanks, everyone
> Yaniv
>
> On Tue, Apr 17, 2018 at 4:27 AM, Nadav Har Tzvi 
> wrote:
>
> > -1
> >
> > There are a few issues:
> >
> > 1. Travis doesn't invoke gradlew test, thus the tests don't run at all in
> > the CI environment.
> > 2. When I deployed Amaterasu on both the mesos vagrant box and the HDP
> box,
> > the action tests failed.
> > Here are the stack traces:
> >
> > [Thread-3] ERROR org.apache.curator.test.TestingZooKeeperServer - From
> > testing server (random state: false) for instance:
> > InstanceSpec{dataDirectory=/tmp/1523902177319-0, port=2181,
> > electionPort=38827, quorumPort=40951, deleteDataDirectoryOnClose=true,
> > serverId=3, tickTime=-1, maxClientCnxns=-1}
> > org.apache.curator.test.InstanceSpec@885
> >
> > java.net.BindException: Address already in use
> > at sun.nio.ch.Net.bind0(Native Method)
> > at sun.nio.ch.Net.bind(Net.java:433)
> > at sun.nio.ch.Net.bind(Net.java:425)
> > at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
> > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
> > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
> > at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
> > at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:111)
> > at org.apache.curator.test.TestingZooKeeperMain.runFromConfig(TestingZooKeeperMain.java:73)
> > at org.apache.curator.test.TestingZooKeeperServer$1.run(TestingZooKeeperServer.java:148)
> > at java.lang.Thread.run(Thread.java:748)
> >
> > org.apache.amaterasu.common.execution.ActionTests *** ABORTED *** (3 milliseconds)
> >   java.lang.RuntimeException: Unable to load a Suite class that was
> > discovered in the runpath: org.apache.amaterasu.common.execution.ActionTests
> >   at org.scalatest.tools.DiscoverySuite$.getSuiteInstance(DiscoverySuite.scala:81)
> >   at org.scalatest.tools.DiscoverySuite$$anonfun$1.apply(DiscoverySuite.scala:38)
> >   at org.scalatest.tools.DiscoverySuite$$anonfun$1.apply(DiscoverySuite.scala:37)
> >   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> >   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> >   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
> >   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
> >   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> >   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> >   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> >   ...
> >   Cause: java.net.BindException: Address already in use
> >   at sun.nio.ch.Net.bind0(Native Method)
> >   at sun.nio.ch.Net.bind(Net.java:433)
> >   at sun.nio.ch.Net.bind(Net.java:425)
> >   at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
> >   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
> >   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
> >   at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
> >   at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:111)
> >   at org.apache.curator.test.TestingZooKeeperMain.runFromConfig(TestingZooKeeperMain.java:73)
> >   at org.apache.curator.test.TestingZooKeeperServer$1.run(TestingZooKeeperServer.java:148)
> >
> > 3. I received the above error also when testing in the local development
> > environment.
> >
> > Do other committers manage to reproduce this? Eyal? Kirupa?
> >
> >
> >
> >
> > On 16 Apr 2018, at 16:46, Yaniv Rodenski  wrote:
> >
> > Hi everyone,
> >
> > Please review and vote on the release candidate #1 for the version
> > 0.2.0-incubating, as follows:
> >
> > [ ] +1, Approve the release
> >
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> >
> > 

[jira] [Commented] (AMATERASU-4) Run an Amaterasu pipeline

2017-10-28 Thread Shad Amez (JIRA)

[ 
https://issues.apache.org/jira/browse/AMATERASU-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223311#comment-16223311
 ] 

Shad Amez commented on AMATERASU-4:
---

Is this project a fork of https://github.com/shintoio/amaterasu? Can I get 
access to the source code/wiki/website to explore and contribute to the project?

> Run an Amaterasu pipeline
> -
>
> Key: AMATERASU-4
> URL: https://issues.apache.org/jira/browse/AMATERASU-4
> Project: AMATERASU
>  Issue Type: Sub-task
>Reporter: Nadav Har Tzvi
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> The user will invoke "ama run"
> "ama run" will take in the following parameters (based on ama-start.sh):
> -r, --repo = 
> -b, --branch = , the default is "master"
> -e, --env = , this should correspond to a path under  /env 
> directory, e.g. /env/default, /env/test, etc. The default value is "default"
> -n, --name = 
> -i, --job-id = TBD
> -r, --report = 
> Invocation will start Amaterasu on demand.
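
(Editor's note: the parameters above can be sketched as the following
hypothetical invocation. The "ama" CLI is what this issue proposes, so it
does not exist yet; the snippet only assembles and prints the command line
as a dry run.)

```shell
# Hypothetical "ama run" command built from the flags listed above.
cmd="ama run \
  --repo https://github.com/shintoio/amaterasu-job-sample.git \
  --branch master \
  --env test \
  --report code"
echo "$cmd"
```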



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (AMATERASU-20) Build fails on travis containers

2018-04-09 Thread Shad Amez (JIRA)

[ 
https://issues.apache.org/jira/browse/AMATERASU-20?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430342#comment-16430342
 ] 

Shad Amez commented on AMATERASU-20:


[~yaniv] I have fixed the build error, which was caused by incorrect 
handling of a data path containing upper-case/CamelCase characters. Let me 
know if it's OK to send the pull request.

> Build fails on travis containers
> 
>
> Key: AMATERASU-20
> URL: https://issues.apache.org/jira/browse/AMATERASU-20
> Project: AMATERASU
>  Issue Type: Bug
>Affects Versions: 0.2.0-incubating
>Reporter: Yaniv Rodenski
>Assignee: Yaniv Rodenski
>Priority: Major
> Fix For: 0.2.0-incubating
>
>
> The build is failing due to absolute paths in some tests





[jira] [Assigned] (AMATERASU-21) Fix Spark scala tests

2018-05-15 Thread Shad Amez (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMATERASU-21?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shad Amez reassigned AMATERASU-21:
--

Assignee: Shad Amez

> Fix Spark scala tests
> -
>
> Key: AMATERASU-21
> URL: https://issues.apache.org/jira/browse/AMATERASU-21
> Project: AMATERASU
>  Issue Type: Task
>Affects Versions: 0.2.1-incubating
>Reporter: Yaniv Rodenski
>    Assignee: Shad Amez
>Priority: Major
>
> Spark Scala tests are currently commented out; they need to be fixed and 
> added back into the Spark test suite.


