[jira] [Closed] (SPARK-5232) CombineFileInputFormatShim#getDirIndices is expensive

2015-01-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang closed SPARK-5232.
--
Resolution: Invalid

Wrong project.

 CombineFileInputFormatShim#getDirIndices is expensive
 -

 Key: SPARK-5232
 URL: https://issues.apache.org/jira/browse/SPARK-5232
 Project: Spark
  Issue Type: Improvement
Reporter: Jimmy Xiang

 [~lirui] found out that we spend quite some time in 
 CombineFileInputFormatShim#getDirIndices. I looked into it, and it seems to me 
 we should be able to get rid of this method completely if we enhance 
 CombineFileInputFormatShim a little.






[jira] [Resolved] (SPARK-5123) Stabilize Spark SQL data type API

2015-01-13 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin resolved SPARK-5123.

   Resolution: Fixed
Fix Version/s: 1.3.0

 Stabilize Spark SQL data type API
 -

 Key: SPARK-5123
 URL: https://issues.apache.org/jira/browse/SPARK-5123
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: Reynold Xin
 Fix For: 1.3.0


 Having two versions of the data type APIs (one for Java, one for Scala) 
 requires downstream libraries to also have two versions of the APIs if the 
 library wants to support both Java and Scala. I took a look at the Scala 
 version of the data type APIs - it can actually work out pretty well for Java 
 out of the box. 
 The proposal is to move Spark SQL data type definitions from 
 org.apache.spark.sql.catalyst.types into org.apache.spark.sql.types, and make 
 the existing Scala type API usable in Java.
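
For illustration, here is a minimal sketch of what downstream code could look like under such a unified package layout (a hypothetical usage example based on the proposal above, not taken from the actual patch; the type names shown are the familiar Spark SQL ones):

{code}
// One set of imports usable from both Scala and Java callers
import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}

val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType, nullable = true)))
{code}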






[jira] [Commented] (SPARK-5220) keepPushingBlocks in BlockGenerator terminated when an exception occurs, which causes the block pushing thread to terminate and blocks receiver

2015-01-13 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276350#comment-14276350
 ] 

Saisai Shao commented on SPARK-5220:


Hi Max, as I said in the mail, this is expected behavior of the receiver and 
block generator because of the locking mechanism in BlockGenerator. The receiver 
blocks on the lock when adding data into BlockGenerator, and BlockGenerator is 
waiting for the pushing thread to put the data into HDFS and the BlockManager. 
Given the mismatched speeds, this is expected from my understanding.
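
To make the hand-off concrete, here is a generic illustration of the back-pressure being described, using plain java.util.concurrent primitives rather than BlockGenerator's actual internals (the queue size is an arbitrary assumption):

{code}
import java.util.concurrent.ArrayBlockingQueue

object BackPressureSketch {
  // The receiver acts as the producer, the block pushing thread as the consumer.
  val blocksForPushing = new ArrayBlockingQueue[Array[Byte]](10)

  // Receiver side: put() blocks once the queue is full, i.e. once the pushing
  // thread falls behind (or has died after an exception).
  def addBlock(block: Array[Byte]): Unit = blocksForPushing.put(block)

  // Pushing-thread side: if this loop ever terminates, addBlock() above will
  // eventually block forever.
  def keepPushing(store: Array[Byte] => Unit): Unit =
    while (true) store(blocksForPushing.take())
}
{code}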

 keepPushingBlocks in BlockGenerator terminated when an exception occurs, 
 which causes the block pushing thread to terminate and blocks receiver  
 -

 Key: SPARK-5220
 URL: https://issues.apache.org/jira/browse/SPARK-5220
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.0
Reporter: Max Xu

 I am running a Spark streaming application with ReliableKafkaReceiver. It 
 uses BlockGenerator to push blocks to BlockManager. However, writing WALs to 
 HDFS may time out, which causes keepPushingBlocks in BlockGenerator to 
 terminate.
 15/01/12 19:07:06 ERROR receiver.BlockGenerator: Error in block pushing thread
 java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
 at 
 scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
 at 
 scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
 at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
 at 
 scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
 at scala.concurrent.Await$.result(package.scala:107)
 at 
 org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler.storeBlock(ReceivedBlockHandler.scala:176)
 at 
 org.apache.spark.streaming.receiver.ReceiverSupervisorImpl.pushAndReportBlock(ReceiverSupervisorImpl.scala:160)
 at 
 org.apache.spark.streaming.receiver.ReceiverSupervisorImpl.pushArrayBuffer(ReceiverSupervisorImpl.scala:126)
 at 
 org.apache.spark.streaming.receiver.Receiver.store(Receiver.scala:124)
 at 
 org.apache.spark.streaming.kafka.ReliableKafkaReceiver.org$apache$spark$streaming$kafka$ReliableKafkaReceiver$$storeBlockAndCommitOffset(ReliableKafkaReceiver.scala:207)
 at 
 org.apache.spark.streaming.kafka.ReliableKafkaReceiver$GeneratedBlockHandler.onPushBlock(ReliableKafkaReceiver.scala:275)
 at 
 org.apache.spark.streaming.receiver.BlockGenerator.pushBlock(BlockGenerator.scala:181)
 at 
 org.apache.spark.streaming.receiver.BlockGenerator.org$apache$spark$streaming$receiver$BlockGenerator$$keepPushingBlocks(BlockGenerator.scala:154)
 at 
 org.apache.spark.streaming.receiver.BlockGenerator$$anon$1.run(BlockGenerator.scala:86)
 Then the block pushing thread is done and no subsequent blocks can be pushed 
 into the BlockManager, which in turn blocks the receiver from receiving new data.
 So when the TimeoutException happens while my app is running, the 
 ReliableKafkaReceiver stays in ACTIVE status but doesn't do anything at all; 
 the application effectively hangs.






[jira] [Commented] (SPARK-4894) Add Bernoulli-variant of Naive Bayes

2015-01-13 Thread RJ Nowling (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276380#comment-14276380
 ] 

RJ Nowling commented on SPARK-4894:
---

Hi @lmcguire,

Always happy to have more help! :)

I started looking through the Spark NB functions but I haven't started writing 
code yet.  The docs for NB mention that using binary features will cause the 
multinomial NB to act like Bernoulli NB.  I don't believe the documentation is 
correct, at least when smoothing is used, since P(0) != 1 - P(1). I was 
planning on comparing the sklearn implementation with the Spark implementation 
and showing that the docs were wrong.  Once verified, I think the changes will 
be very small to add a Bernoulli mode controlled by a flag in the constructor.

I won't get to this until next week, though.  If you have time now and want to 
tackle this, I'd be happy to hand it over to you and review any patches.  (I'm 
not a committer, though -- [~mengxr] would have to sign off.) Otherwise, if 
you want to wait until I have a patch and test it, that could work, too.  What 
do you think?
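
For reference, here is the usual textbook formulation of the two likelihoods (a sketch of the standard models, not of MLlib's code): an absent feature drops out of the multinomial product entirely, while it contributes an explicit (1 - p) factor in the Bernoulli model, so binarizing the features does not make the two interchangeable in general.

{code}
P_mult(x | c) \propto \prod_i \theta_{c,i}^{x_i}                    (x_i = 0 contributes nothing)
P_bern(x | c) = \prod_i p_{c,i}^{x_i} (1 - p_{c,i})^{1 - x_i}       (x_i = 0 contributes 1 - p_{c,i})
{code}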

 Add Bernoulli-variant of Naive Bayes
 

 Key: SPARK-4894
 URL: https://issues.apache.org/jira/browse/SPARK-4894
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Affects Versions: 1.1.1
Reporter: RJ Nowling

 MLlib only supports the multinomial-variant of Naive Bayes.  The Bernoulli 
 version of Naive Bayes is more useful for situations where the features are 
 binary values.






[jira] [Updated] (SPARK-1805) Error launching cluster when master and slaves machines are of different visualization types

2015-01-13 Thread Nicholas Chammas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Chammas updated SPARK-1805:

Issue Type: Bug  (was: Improvement)

 Error launching cluster when master and slaves machines are of different 
 visualization types
 

 Key: SPARK-1805
 URL: https://issues.apache.org/jira/browse/SPARK-1805
 Project: Spark
  Issue Type: Bug
  Components: EC2
Affects Versions: 0.9.0, 0.9.1, 1.0.0
Reporter: Han JU
Priority: Minor

 In current EC2 script, the AMI image object is loaded only once. This is ok 
 when master and slave machines are of the same visualization type (pvm or 
 hvm). But this won't work if, say, master is pvm and slaves are hvm since the 
 AMI is not compatible between these two kinds of visualization. 






[jira] [Updated] (SPARK-1805) Error launching cluster when master and slave machines are of different virtualization types

2015-01-13 Thread Nicholas Chammas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Chammas updated SPARK-1805:

 Description: In the current EC2 script, the AMI image object is loaded 
only once. This is OK when the master and slave machines are of the same 
virtualization type (pv or hvm). But this won't work if, say, the master is pv 
and the slaves are hvm since the AMI is not compatible across these two kinds 
of virtualization.  (was: In current EC2 script, the AMI image object is loaded 
only once. This is ok when master and slave machines are of the same 
visualization type (pvm or hvm). But this won't work if, say, master is pvm and 
slaves are hvm since the AMI is not compatible between these two kinds of 
visualization. )
Target Version/s: 1.3.0
 Summary: Error launching cluster when master and slave machines 
are of different virtualization types  (was: Error launching cluster when 
master and slaves machines are of different visualization types)

 Error launching cluster when master and slave machines are of different 
 virtualization types
 

 Key: SPARK-1805
 URL: https://issues.apache.org/jira/browse/SPARK-1805
 Project: Spark
  Issue Type: Bug
  Components: EC2
Affects Versions: 0.9.0, 0.9.1, 1.0.0, 1.1.1, 1.2.0
Reporter: Han JU
Priority: Minor

 In the current EC2 script, the AMI image object is loaded only once. This is 
 OK when the master and slave machines are of the same virtualization type (pv 
 or hvm). But this won't work if, say, the master is pv and the slaves are hvm 
 since the AMI is not compatible across these two kinds of virtualization.






[jira] [Created] (SPARK-5235) java.io.NotSerializableException: org.apache.spark.sql.SQLConf

2015-01-13 Thread Alex Baretta (JIRA)
Alex Baretta created SPARK-5235:
---

 Summary: java.io.NotSerializableException: 
org.apache.spark.sql.SQLConf
 Key: SPARK-5235
 URL: https://issues.apache.org/jira/browse/SPARK-5235
 Project: Spark
  Issue Type: Bug
Reporter: Alex Baretta


The SQLConf field in SQLContext is neither Serializable nor transient. Here's 
the stack trace I get when running SQL queries against a Parquet file.

Exception in thread Thread-43 org.apache.spark.SparkException: Job aborted 
due to stage failure: Task not serializable: java.io.NotSerializableException: 
org.apache.spark.sql.SQLConf
at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1195)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1184)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1183)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at 
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1183)
at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:843)
at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:779)
at 
org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:763)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1364)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1356)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
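
One direction, sketched below with hypothetical stand-in classes (this is only an illustration of the idea, not necessarily the fix that was ultimately applied): a field that tasks do not need can be marked @transient so it is skipped when the enclosing object is pulled into a serialized closure, and recreated lazily on access instead of having to implement Serializable.

{code}
// Hypothetical stand-in classes, not Spark's code.
class MySQLConf {                      // stand-in for the non-serializable conf
  var codegenEnabled: Boolean = false
}

class MyContext extends Serializable {
  @transient private lazy val conf = new MySQLConf  // not serialized with closures

  def isCodegenEnabled: Boolean = conf.codegenEnabled
}
{code}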






[jira] [Commented] (SPARK-3678) Yarn app name reported in RM is different between cluster and client mode

2015-01-13 Thread WangTaoTheTonic (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276429#comment-14276429
 ] 

WangTaoTheTonic commented on SPARK-3678:


In SparkHdfsLR there is {quote}val sparkConf = new 
SparkConf().setAppName("SparkHdfsLR"){quote}.

In client mode, the registration with YARN happens in YarnClientSchedulerBackend, 
which runs after the setAppName above, while in cluster mode the registration 
happens in yarn.Client, which runs before it.

So it is the registration order that makes the difference.
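
A small sketch of that ordering (the object name here is hypothetical, and the submit-time workaround at the end is only an illustration, not the fix):

{code}
import org.apache.spark.{SparkConf, SparkContext}

object SparkHdfsLRLike {
  def main(args: Array[String]): Unit = {
    // In cluster mode yarn.Client has already registered the application with
    // the RM before this main() runs, so the RM falls back to the class name;
    // in client mode the name below is set before YarnClientSchedulerBackend registers.
    val sparkConf = new SparkConf().setAppName("SparkHdfsLR")
    val sc = new SparkContext(sparkConf)
    // ... job body ...
    sc.stop()
  }
}
// Workaround sketch: pass the name at submission time so it is known before the
// AM is submitted, e.g. spark-submit --name SparkHdfsLR --class ... (assumes the
// standard spark-submit --name option).
{code}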

 Yarn app name reported in RM is different between cluster and client mode
 -

 Key: SPARK-3678
 URL: https://issues.apache.org/jira/browse/SPARK-3678
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.1.0
Reporter: Thomas Graves

 If you launch an application in yarn cluster mode the name of the application 
 in the ResourceManager generally shows up as the full name 
 org.apache.spark.examples.SparkHdfsLR.  If you start the same app in client 
 mode it shows up as SparkHdfsLR.
 We should be consistent between them.  
 I haven't looked at it in detail; perhaps it's only the examples, but I think 
 I've seen this with customer apps also.






[jira] [Created] (SPARK-5234) examples for ml don't have sparkContext.stop

2015-01-13 Thread yuhao yang (JIRA)
yuhao yang created SPARK-5234:
-

 Summary: examples for ml don't have sparkContext.stop
 Key: SPARK-5234
 URL: https://issues.apache.org/jira/browse/SPARK-5234
 Project: Spark
  Issue Type: Improvement
  Components: ML
Affects Versions: 1.2.0
 Environment: all
Reporter: yuhao yang
Priority: Trivial
 Fix For: 1.3.0


Not sure why sc.stop() is not called in the 
org.apache.spark.examples.ml examples {CrossValidatorExample, SimpleParamsExample, 
SimpleTextClassificationPipeline}. 

I can prepare a PR if omitting the call to stop is not intentional.
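
For what it's worth, the change being suggested is just the usual pattern below (a minimal hypothetical skeleton, not the contents of the actual example files):

{code}
import org.apache.spark.{SparkConf, SparkContext}

object SimpleParamsExampleLike {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SimpleParamsExample"))
    // ... build the ML pipeline, fit and transform ...
    sc.stop()  // release cluster resources once the example finishes
  }
}
{code}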






[jira] [Commented] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2015-01-13 Thread Nicholas Chammas (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276471#comment-14276471
 ] 

Nicholas Chammas commented on SPARK-3821:
-

[~shivaram] Are we ready to open a PR against {{mesos/spark-ec2}} and start a 
review discussion there?

 Develop an automated way of creating Spark images (AMI, Docker, and others)
 ---

 Key: SPARK-3821
 URL: https://issues.apache.org/jira/browse/SPARK-3821
 Project: Spark
  Issue Type: Improvement
  Components: Build, EC2
Reporter: Nicholas Chammas
Assignee: Nicholas Chammas
 Attachments: packer-proposal.html


 Right now the creation of Spark AMIs or Docker containers is done manually. 
 With tools like [Packer|http://www.packer.io/], we should be able to automate 
 this work, and do so in such a way that multiple types of machine images can 
 be created from a single template.






[jira] [Commented] (SPARK-3185) SPARK launch on Hadoop 2 in EC2 throws Tachyon exception when Formatting JOURNAL_FOLDER

2015-01-13 Thread Florian Verhein (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276436#comment-14276436
 ] 

Florian Verhein commented on SPARK-3185:


I'm also getting this, though with Server IPC version 9 now that I'm using 
hadoop 2.4.1 (modification of the various hadoop init.sh scripts). I'm also 
using spark 1.2.0.

My understanding is that spark-1.2.0-bin-hadoop2.4.tgz is built against hadoop 
2.4 and tachyon 0.4.1. 
But I suspect the tachyon 0.4.1 that is installed in the spark-ec2 scripts is 
built against hadoop 1...

Does this mean building tachyon against hadoop 2.4.1 would fix this?

 SPARK launch on Hadoop 2 in EC2 throws Tachyon exception when Formatting 
 JOURNAL_FOLDER
 ---

 Key: SPARK-3185
 URL: https://issues.apache.org/jira/browse/SPARK-3185
 Project: Spark
  Issue Type: Bug
Affects Versions: 1.0.2
 Environment: Amazon Linux AMI
 [ec2-user@ip-172-30-1-145 ~]$ uname -a
 Linux ip-172-30-1-145 3.10.42-52.145.amzn1.x86_64 #1 SMP Tue Jun 10 23:46:43 
 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
 https://aws.amazon.com/amazon-linux-ami/2014.03-release-notes/
 The build I used (and MD5 verified):
 [ec2-user@ip-172-30-1-145 ~]$ wget 
 http://supergsego.com/apache/spark/spark-1.0.2/spark-1.0.2-bin-hadoop2.tgz
Reporter: Jeremy Chambers

 {code}
 org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot 
 communicate with client version 4
 {code}
 When I launch SPARK 1.0.2 on Hadoop 2 in a new EC2 cluster, the above tachyon 
 exception is thrown when Formatting JOURNAL_FOLDER.
 No exception occurs when I launch on Hadoop 1.
 Launch used:
 {code}
 ./spark-ec2 -k spark_cluster -i /home/ec2-user/kagi/spark_cluster.ppk 
 --zone=us-east-1a --hadoop-major-version=2 --spot-price=0.0165 -s 3 launch 
 sparkProd
 {code}
 {code}
 log snippet
 Formatting Tachyon Master @ ec2-54-80-49-244.compute-1.amazonaws.com
 Formatting JOURNAL_FOLDER: /root/tachyon/libexec/../journal/
 Exception in thread main java.lang.RuntimeException: 
 org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot 
 communicate with client version 4
 at tachyon.util.CommonUtils.runtimeException(CommonUtils.java:246)
 at tachyon.UnderFileSystemHdfs.init(UnderFileSystemHdfs.java:73)
 at tachyon.UnderFileSystemHdfs.getClient(UnderFileSystemHdfs.java:53)
 at tachyon.UnderFileSystem.get(UnderFileSystem.java:53)
 at tachyon.Format.main(Format.java:54)
 Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot 
 communicate with client version 4
 at org.apache.hadoop.ipc.Client.call(Client.java:1070)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
 at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
 at 
 org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
 at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:238)
 at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:203)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
 at 
 org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
 at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
 at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
 at tachyon.UnderFileSystemHdfs.init(UnderFileSystemHdfs.java:69)
 ... 3 more
 Killed 0 processes
 Killed 0 processes
 ec2-54-167-219-159.compute-1.amazonaws.com: Killed 0 processes
 ec2-54-198-198-17.compute-1.amazonaws.com: Killed 0 processes
 ec2-54-166-36-0.compute-1.amazonaws.com: Killed 0 processes
 ---end snippet---
 {code}
 *I don't have this problem when I launch without the 
 --hadoop-major-version=2 (which defaults to Hadoop 1.x).*






[jira] [Commented] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2015-01-13 Thread Shivaram Venkataraman (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276505#comment-14276505
 ] 

Shivaram Venkataraman commented on SPARK-3821:
--

[~nchammas] Yes -- That sounds good

 Develop an automated way of creating Spark images (AMI, Docker, and others)
 ---

 Key: SPARK-3821
 URL: https://issues.apache.org/jira/browse/SPARK-3821
 Project: Spark
  Issue Type: Improvement
  Components: Build, EC2
Reporter: Nicholas Chammas
Assignee: Nicholas Chammas
 Attachments: packer-proposal.html


 Right now the creation of Spark AMIs or Docker containers is done manually. 
 With tools like [Packer|http://www.packer.io/], we should be able to automate 
 this work, and do so in such a way that multiple types of machine images can 
 be created from a single template.






[jira] [Commented] (SPARK-5235) java.io.NotSerializableException: org.apache.spark.sql.SQLConf

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276507#comment-14276507
 ] 

Apache Spark commented on SPARK-5235:
-

User 'alexbaretta' has created a pull request for this issue:
https://github.com/apache/spark/pull/4031

 java.io.NotSerializableException: org.apache.spark.sql.SQLConf
 --

 Key: SPARK-5235
 URL: https://issues.apache.org/jira/browse/SPARK-5235
 Project: Spark
  Issue Type: Bug
Reporter: Alex Baretta

 The SQLConf field in SQLContext is neither Serializable nor transient. Here's 
 the stack trace I get when running SQL queries against a Parquet file.
 Exception in thread Thread-43 org.apache.spark.SparkException: Job aborted 
 due to stage failure: Task not serializable: 
 java.io.NotSerializableException: org.apache.spark.sql.SQLConf
 at 
 org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1195)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1184)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1183)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
 at 
 org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1183)
 at 
 org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:843)
 at 
 org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:779)
 at 
 org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:763)
 at 
 org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1364)
 at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
 at 
 org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1356)
 at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
 at akka.actor.ActorCell.invoke(ActorCell.scala:487)
 at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
 at akka.dispatch.Mailbox.run(Mailbox.scala:220)
 at 
 akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
 at 
 scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
 at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
 at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
 at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)






[jira] [Updated] (SPARK-5233) Error replay of WAL when recovered from driver failure

2015-01-13 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-5233:
---
Description: 
Spark Streaming writes all the events into the WAL for driver recovery; the 
sequence in the WAL may look like this:

{code}

BlockAdditionEvent --- BlockAdditionEvent --- BlockAdditionEvent --- 
BatchAllocationEvent --- BatchCleanupEvent --- BlockAdditionEvent --- 
BlockAdditionEvent --- 'Driver Down Time' --- BlockAdditionEvent --- 
BlockAdditionEvent --- BatchAllocationEvent

{code}

When the driver recovers from failure, it replays all the existing metadata WAL 
to get the right status. In this situation, the two BlockAdditionEvents from before 
the failure are put into the received block queue. After the driver has started, 
newly incoming blocks are also put into this queue, and a follow-up 
BatchAllocationEvent allocates an allocatedBlocks by draining the queue. So old 
data that does not belong to this batch also gets mixed into this batch; here is 
the partial log:

{code}
15/01/13 17:19:10 INFO KafkaInputDStream: block store result for batch 142114075 ms

15/01/13 17:19:10 INFO KafkaInputDStream: log segment: WriteAheadLogFileSegment(file: /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-1421140653201,46704,480)
197757 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: WriteAheadLogFileSegment(file: /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-1421140653201,47188,480)
197758 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: WriteAheadLogFileSegment(file: /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-1421140653201,47672,480)
197759 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: WriteAheadLogFileSegment(file: /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-1421140653201,48156,480)
197760 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: WriteAheadLogFileSegment(file: /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-1421140653201,48640,480)
197761 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: WriteAheadLogFileSegment(file: /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-1421140653201,49124,480)
197762 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: WriteAheadLogFileSegment(file: /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-1421140807074,0,44184)
197763 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: WriteAheadLogFileSegment(file: /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-1421140807074,44188,58536)
197764 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: WriteAheadLogFileSegment(file: /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-1421140807074,102728,60168)
197765 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: WriteAheadLogFileSegment(file: /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-1421140807074,162900,64584)
197766 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: WriteAheadLogFileSegment(file: /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-1421140807074,227488,51240)
{code}

The old log 
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-1421140653201 
is obviously far older than the current batch interval, and it will be fetched 
again and added for processing.

This issue is subtle, because in the previous code we never deleted the old 
received-data WAL. As far as I know, this will lead to unwanted results.

Basically, this happens because we miss some BatchAllocationEvents when recovering 
from failure. I think we need to replay and insert all the events correctly.

  was:
Spark Streaming will write all the event into WAL for driver recovery, the 
sequence in the WAL may be like this:

{code}

BlockAdditionEvent --- BlockAdditionEvent --- BlockAdditionEvent --- 
BatchAllocationEvent --- BatchCleanupEvent --- BlockAdditionEvent --- 
BlockAdditionEvent --- 'Driver Down Time' --- BlockAdditionEvent --- 
BlockAdditionEvent --- BatchAllocationEvent

{code}

When driver recovered from failure, it will replay all the existed metadata WAL 
to get the right status, in this situation, two BatchAdditionEvent before down 
will put into received block queue. After driver started, new incoming blocking 
will also put into this queue and a follow-up BlockAllocationEvent will 
allocate an allocatedBlocks with queue draining out. So old, not this 

[jira] [Created] (SPARK-5236) parquet.io.ParquetDecodingException: Can not read value at 0 in block 0

2015-01-13 Thread Alex Baretta (JIRA)
Alex Baretta created SPARK-5236:
---

 Summary: parquet.io.ParquetDecodingException: Can not read value 
at 0 in block 0
 Key: SPARK-5236
 URL: https://issues.apache.org/jira/browse/SPARK-5236
 Project: Spark
  Issue Type: Bug
Reporter: Alex Baretta


15/01/14 05:39:27 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 18.0 
(TID 28, localhost): parquet.io.ParquetDecodingException: Can not read value at 
0 in block 0 in file gs://pa-truven/20141205/parquet/P/part-r-1.parquet
at 
parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:213)
at 
parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:204)
at 
org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:145)
at 
org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at 
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at 
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at 
scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at 
org.apache.spark.sql.execution.Limit$$anonfun$4.apply(basicOperators.scala:141)
at 
org.apache.spark.sql.execution.Limit$$anonfun$4.apply(basicOperators.scala:141)
at 
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1331)
at 
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: 
org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to 
org.apache.spark.sql.catalyst.expressions.MutableInt
at 
org.apache.spark.sql.catalyst.expressions.SpecificMutableRow.setInt(SpecificMutableRow.scala:241)
at 
org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter.updateInt(ParquetConverter.scala:375)
at 
org.apache.spark.sql.parquet.CatalystPrimitiveConverter.addInt(ParquetConverter.scala:434)
at 
parquet.column.impl.ColumnReaderImpl$2$3.writeValue(ColumnReaderImpl.java:237)
at 
parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:353)
at 
parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:402)
at 
parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:194)
... 27 more







[jira] [Commented] (SPARK-5233) Error replay of WAL when recovered from driver failure

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276513#comment-14276513
 ] 

Apache Spark commented on SPARK-5233:
-

User 'jerryshao' has created a pull request for this issue:
https://github.com/apache/spark/pull/4032

 Error replay of WAL when recovered from driver failure
 -

 Key: SPARK-5233
 URL: https://issues.apache.org/jira/browse/SPARK-5233
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.0
Reporter: Saisai Shao

 Spark Streaming will write all the event into WAL for driver recovery, the 
 sequence in the WAL may be like this:
 {code}
 BlockAdditionEvent --- BlockAdditionEvent --- BlockAdditionEvent --- 
 BatchAllocationEvent --- BatchCleanupEvent --- BlockAdditionEvent --- 
 BlockAdditionEvent --- 'Driver Down Time' --- BlockAdditionEvent --- 
 BlockAdditionEvent --- BatchAllocationEvent
 {code}
 When driver recovered from failure, it will replay all the existed metadata 
 WAL to get the right status, in this situation, two BatchAdditionEvent before 
 down will put into received block queue. After driver started, new incoming 
 blocking will also put into this queue and a follow-up BlockAllocationEvent 
 will allocate an allocatedBlocks with queue draining out. So old, not this 
 batch's data will also mix into this batch, here is the partial log:
 {code}
 15/01/13 17:19:10 INFO KafkaInputDStream: block store result for 
 batch 142114075 ms
 
 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,46704,480)
 197757 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,47188,480)
 197758 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,47672,480)
 197759 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,48156,480)   

 197760 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,48640,480)
 197761 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,49124,480)
 197762 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
07074,0,44184)
 197763 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
07074,44188,58536)
 197764 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
07074,102728,60168)
 197765 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
07074,162900,64584)
 197766 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
07074,227488,51240)
 {code}
 The old log 
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
  is obviously far older than current batch interval, and will fetch again to 
 add to process.
 This issue is subtle, because in the previous code we never delete the old 
 received data WAL. This will lead to unwanted result as I know.
 Basically because we miss some BlockAllocationEvent when recovered from 
 failure. I think we need to correctly replay and insert all the events 
 correctly.




[jira] [Created] (SPARK-5237) UDTFs don't work on Spark SQL

2015-01-13 Thread Yi Zhou (JIRA)
Yi Zhou created SPARK-5237:
--

 Summary: UDTFs don't work on Spark SQL
 Key: SPARK-5237
 URL: https://issues.apache.org/jira/browse/SPARK-5237
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.0
Reporter: Yi Zhou









[jira] [Updated] (SPARK-5233) Error replay of WAL when recovered from driver failure

2015-01-13 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-5233:
---
Issue Type: Sub-task  (was: Bug)
Parent: SPARK-5238

 Error replay of WAL when recovered from driver failure
 -

 Key: SPARK-5233
 URL: https://issues.apache.org/jira/browse/SPARK-5233
 Project: Spark
  Issue Type: Sub-task
  Components: Streaming
Affects Versions: 1.2.0
Reporter: Saisai Shao

 Spark Streaming will write all the event into WAL for driver recovery, the 
 sequence in the WAL may be like this:
 {code}
 BlockAdditionEvent --- BlockAdditionEvent --- BlockAdditionEvent --- 
 BatchAllocationEvent --- BatchCleanupEvent --- BlockAdditionEvent --- 
 BlockAdditionEvent --- 'Driver Down Time' --- BlockAdditionEvent --- 
 BlockAdditionEvent --- BatchAllocationEvent
 {code}
 When driver recovered from failure, it will replay all the existed metadata 
 WAL to get the right status, in this situation, two BatchAdditionEvent before 
 down will put into received block queue. After driver started, new incoming 
 blocking will also put into this queue and a follow-up BlockAllocationEvent 
 will allocate an allocatedBlocks with queue draining out. So old, not this 
 batch's data will also mix into this batch, here is the partial log:
 {code}
 15/01/13 17:19:10 INFO KafkaInputDStream: block store result for 
 batch 142114075 ms
 
 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,46704,480)
 197757 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,47188,480)
 197758 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,47672,480)
 197759 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,48156,480)   

 197760 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,48640,480)
 197761 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
53201,49124,480)
 197762 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
07074,0,44184)
 197763 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
07074,44188,58536)
 197764 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
07074,102728,60168)
 197765 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
07074,162900,64584)
 197766 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
 WriteAheadLogFileSegment(file:   
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
07074,227488,51240)
 {code}
 The old log 
 /home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
  is obviously far older than current batch interval, and will fetch again to 
 add to process.
 This issue is subtle, because in the previous code we never delete the old 
 received data WAL. This will lead to unwanted result as I know.
 Basically because we miss some BlockAllocationEvent when recovered from 
 failure. I think we need to correctly replay and insert all the events 
 correctly.






[jira] [Updated] (SPARK-5237) UDTFs don't work on Spark SQL

2015-01-13 Thread Yi Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Zhou updated SPARK-5237:
---
Description: Hive queries with UDTFs don't work on Spark SQL

15/01/14 13:23:50 INFO ParseDriver: Parse Completed
15/01/14 13:23:50 WARN HiveConf: DEPRECATED: Configuration property 
hive.metastore.local no longer has any effect. Make sure to provide a valid 
value for hive.metastore.uris if you are connecting to a remote metastore.
15/01/14 13:23:50 INFO ParseDriver: Parsing command: INSERT INTO TABLE 
q10_spark_RUN_QUERY_0_result
SELECT extract_sentiment(pr_item_sk,pr_review_content) AS (pr_item_sk, 
review_sentence, sentiment, sentiment_word)
FROM product_reviews
15/01/14 13:23:50 INFO ParseDriver: Parse Completed
15/01/14 13:23:50 ERROR SparkSQLDriver: Failed in [

INSERT INTO TABLE ${hiveconf:RESULT_TABLE}
SELECT extract_sentiment(pr_item_sk,pr_review_content) AS (pr_item_sk, 
review_sentence, sentiment, sentiment_word)
FROM product_reviews
]
java.lang.RuntimeException:
Unsupported language features in query: INSERT INTO TABLE 
q10_spark_RUN_QUERY_0_result
SELECT extract_sentiment(pr_item_sk,pr_review_content) AS (pr_item_sk, 
review_sentence, sentiment, sentiment_word)
FROM product_reviews
TOK_QUERY
  TOK_FROM
TOK_TABREF
  TOK_TABNAME
product_reviews
  TOK_INSERT
TOK_INSERT_INTO
  TOK_TAB
TOK_TABNAME
  q10_spark_RUN_QUERY_0_result
TOK_SELECT
  TOK_SELEXPR
TOK_FUNCTION
  extract_sentiment
  TOK_TABLE_OR_COL
pr_item_sk
  TOK_TABLE_OR_COL
pr_review_content
pr_item_sk
review_sentence
sentiment
sentiment_word

scala.NotImplementedError: No parse rules for:
 TOK_SELEXPR
  TOK_FUNCTION
extract_sentiment
TOK_TABLE_OR_COL
  pr_item_sk
TOK_TABLE_OR_COL
  pr_review_content
  pr_item_sk
  review_sentence
  sentiment
  sentiment_word

org.apache.spark.sql.hive.HiveQl$.selExprNodeToExpr(HiveQl.scala:862)

at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:251)
at 
org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:50)
at 
org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:49)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
at 
scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
at 
scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
at 
scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
at 
scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
at 
scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
at 
scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
at 
scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
at 
scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
at 
scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
at 
scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
at 
scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at 
scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
at 
scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
at 
org.apache.spark.sql.catalyst.AbstractSparkSQLParser.apply(SparkSQLParser.scala:31)
at org.apache.spark.sql.hive.HiveQl$$anonfun$3.apply(HiveQl.scala:133)
at org.apache.spark.sql.hive.HiveQl$$anonfun$3.apply(HiveQl.scala:133)
at 
org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:174)
at 
org.apache.spark.sql.catalyst.SparkSQLParser$$anonfun$org$apache$spark$sql$catalyst$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:173)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
at 
scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
at 
scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
at 
scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
at 

[jira] [Updated] (SPARK-5238) Improve the robustness of Spark Streaming WAL mechanism

2015-01-13 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-5238:
---
Description: 
Several issues have been identified in Spark Streaming's WAL mechanism; this is 
an umbrella for all the related issues.


 Improve the robustness of Spark Streaming WAL mechanism
 ---

 Key: SPARK-5238
 URL: https://issues.apache.org/jira/browse/SPARK-5238
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.0
Reporter: Saisai Shao

 Several issues have been identified in Spark Streaming's WAL mechanism; this 
 is an umbrella for all the related issues.






[jira] [Created] (SPARK-5238) Improve the robustness of Spark Streaming WAL mechanism

2015-01-13 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-5238:
--

 Summary: Improve the robustness of Spark Streaming WAL mechanism
 Key: SPARK-5238
 URL: https://issues.apache.org/jira/browse/SPARK-5238
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.0
Reporter: Saisai Shao









[jira] [Updated] (SPARK-5147) write ahead logs from streaming receiver are not purged because cleanupOldBlocks in WriteAheadLogBasedBlockHandler is never called

2015-01-13 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-5147:
---
Issue Type: Sub-task  (was: Bug)
Parent: SPARK-5238

 write ahead logs from streaming receiver are not purged because 
 cleanupOldBlocks in WriteAheadLogBasedBlockHandler is never called
 --

 Key: SPARK-5147
 URL: https://issues.apache.org/jira/browse/SPARK-5147
 Project: Spark
  Issue Type: Sub-task
  Components: Streaming
Affects Versions: 1.2.0
Reporter: Max Xu
Priority: Blocker

 Hi all,
 We are running a Spark streaming application with ReliableKafkaReceiver. We 
 have spark.streaming.receiver.writeAheadLog.enable set to true so write 
 ahead logs (WALs) for received data are created under receivedData/streamId 
 folder in the checkpoint directory. 
 However, old WALs are never purged over time. receivedBlockMetadata and 
 checkpoint files are purged correctly, though. I went through the code: the 
 WriteAheadLogBasedBlockHandler class in ReceivedBlockHandler.scala is 
 responsible for cleaning up the old blocks. It has the method cleanupOldBlocks, 
 which is never called by any class. The ReceiverSupervisorImpl class holds a 
 WriteAheadLogBasedBlockHandler instance; however, it only calls the storeBlock 
 method to create WALs and never calls the cleanupOldBlocks method to purge old 
 WALs.
 The size of the WAL folder increases constantly on HDFS. This is preventing 
 us from running the ReliableKafkaReceiver 24x7. Could somebody please take a 
 look?
 Thanks,
 Max






[jira] [Updated] (SPARK-5142) Possibly data may be ruined in Spark Streaming's WAL mechanism.

2015-01-13 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-5142:
---
Issue Type: Sub-task  (was: Bug)
Parent: SPARK-5238

 Possibly data may be ruined in Spark Streaming's WAL mechanism.
 ---

 Key: SPARK-5142
 URL: https://issues.apache.org/jira/browse/SPARK-5142
 Project: Spark
  Issue Type: Sub-task
  Components: Streaming
Affects Versions: 1.2.0
Reporter: Saisai Shao

 Currently in Spark Streaming's WAL manager, data is written into HDFS 
 with multiple tries when a failure is encountered. Because there is no 
 transactional guarantee, the previously partially-written data is not rolled 
 back and the retried data is appended after it; this ruins the file and makes 
 WriteAheadLogReader fail when reading the data.
 Firstly, I think this problem is hard to fix because HDFS does not support a 
 truncate operation (HDFS-3107) or random writes at a specific offset.
 Secondly, I think if we hit such a write exception, it is better not to try 
 again; retrying will ruin the file and make reads abnormal.
 Sorry if I misunderstand anything.
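
A tiny generic sketch of the "fail fast instead of retrying the append" idea raised above (plain Scala for illustration, not Spark's WriteAheadLogManager):

{code}
// If an append fails partway through, retrying into the same file would place
// the retried record after the partially written bytes and corrupt the segment
// for the reader, so the write is attempted only once here.
def writeOnce[T](attempt: => T): Option[T] =
  try Some(attempt)
  catch {
    case e: java.io.IOException =>
      None  // surface the failure to the caller instead of appending again
  }
{code}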






[jira] [Created] (SPARK-5239) JdbcRDD throws java.lang.AbstractMethodError: oracle.jdbc.driver.xxxxxx.isClosed()Z

2015-01-13 Thread Gankun Luo (JIRA)
Gankun Luo created SPARK-5239:
-

 Summary: JdbcRDD throws java.lang.AbstractMethodError: 
oracle.jdbc.driver.xx.isClosed()Z
 Key: SPARK-5239
 URL: https://issues.apache.org/jira/browse/SPARK-5239
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.2.0, 1.1.1
 Environment: centos6.4 + ojdbc14
Reporter: Gankun Luo
Priority: Minor


I tried to use JdbcRDD to operate on a table in an Oracle database, but it failed. 
My test code is as follows:

{code}
import java.sql.DriverManager
import org.apache.spark.SparkContext
import org.apache.spark.rdd.JdbcRDD
import org.apache.spark.SparkConf

object JdbcRDD4Oracle {
  def main(args: Array[String]) {
    val sc = new SparkContext(
      new SparkConf().setAppName("JdbcRDD4Oracle").setMaster("local[2]"))
    val rdd = new JdbcRDD(sc,
      () => getConnection, getSQL, 12987, 13055, 3,
      r => {
        (r.getObject("HISTORY_ID"), r.getObject("APPROVE_OPINION"))
      })
    println(rdd.collect.toList)

    sc.stop()
  }

  def getConnection() = {
    Class.forName("oracle.jdbc.driver.OracleDriver").newInstance()
    DriverManager.getConnection("jdbc:oracle:thin:@hadoop000:1521/ORCL",
      "scott", "tiger")
  }

  def getSQL() = {
    "select HISTORY_ID,APPROVE_OPINION from CI_APPROVE_HISTORY WHERE HISTORY_ID >= ? AND HISTORY_ID <= ?"
  }
}
{code}

Run the example, I get the following exception:
{code}
09:56:48,302 [Executor task launch worker-0] ERROR Logging$class : Error in 
TaskCompletionListener
java.lang.AbstractMethodError: 
oracle.jdbc.driver.OracleResultSetImpl.isClosed()Z
at org.apache.spark.rdd.JdbcRDD$$anon$1.close(JdbcRDD.scala:99)
at 
org.apache.spark.util.NextIterator.closeIfNeeded(NextIterator.scala:63)
at 
org.apache.spark.rdd.JdbcRDD$$anon$1$$anonfun$1.apply(JdbcRDD.scala:71)
at 
org.apache.spark.rdd.JdbcRDD$$anon$1$$anonfun$1.apply(JdbcRDD.scala:71)
at 
org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:85)
at 
org.apache.spark.TaskContext$$anonfun$markTaskCompleted$1.apply(TaskContext.scala:110)
at 
org.apache.spark.TaskContext$$anonfun$markTaskCompleted$1.apply(TaskContext.scala:108)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.TaskContext.markTaskCompleted(TaskContext.scala:108)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:64)
at org.apache.spark.scheduler.Task.run(Task.scala:54)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:181)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
09:56:48,302 [Executor task launch worker-1] ERROR Logging$class : Error in 
TaskCompletionListener
{code}
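
One way to sidestep the error, sketched here only as a workaround idea rather than the JdbcRDD patch: ojdbc14 implements JDBC 3, which has no ResultSet.isClosed(), hence the AbstractMethodError, so avoid probing isClosed() and just close defensively.

{code}
// Hedged workaround sketch: close without calling isClosed(), swallowing the
// "already closed" case instead.
def closeQuietly(rs: java.sql.ResultSet): Unit =
  if (rs != null) {
    try rs.close()                 // close() exists in JDBC 3 drivers
    catch { case _: Exception => } // ignore double-close / driver quirks
  }
{code}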






[jira] [Commented] (SPARK-5239) JdbcRDD throws java.lang.AbstractMethodError: oracle.jdbc.driver.xxxxxx.isClosed()Z

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276540#comment-14276540
 ] 

Apache Spark commented on SPARK-5239:
-

User 'luogankun' has created a pull request for this issue:
https://github.com/apache/spark/pull/4033

 JdbcRDD throws java.lang.AbstractMethodError: 
 oracle.jdbc.driver.xx.isClosed()Z
 -

 Key: SPARK-5239
 URL: https://issues.apache.org/jira/browse/SPARK-5239
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.1, 1.2.0
 Environment: centos6.4 + ojdbc14
Reporter: Gankun Luo
Priority: Minor

 I tried to use JdbcRDD to operate on a table in an Oracle database, but failed. My
 test code is as follows:
 {code}
 import java.sql.DriverManager
 import org.apache.spark.SparkContext
 import org.apache.spark.rdd.JdbcRDD
 import org.apache.spark.SparkConf

 object JdbcRDD4Oracle {
   def main(args: Array[String]) {
     val sc = new SparkContext(new
       SparkConf().setAppName("JdbcRDD4Oracle").setMaster("local[2]"))
     val rdd = new JdbcRDD(sc,
       () => getConnection, getSQL, 12987, 13055, 3,
       r => {
         (r.getObject("HISTORY_ID"), r.getObject("APPROVE_OPINION"))
       })
     println(rdd.collect.toList)

     sc.stop()
   }

   def getConnection() = {
     Class.forName("oracle.jdbc.driver.OracleDriver").newInstance()
     DriverManager.getConnection("jdbc:oracle:thin:@hadoop000:1521/ORCL", "scott", "tiger")
   }

   def getSQL() = {
     "select HISTORY_ID,APPROVE_OPINION from CI_APPROVE_HISTORY WHERE HISTORY_ID >=? AND HISTORY_ID <=?"
   }
 }
 {code}
 Run the example and I get the following exception:
 {code}
 09:56:48,302 [Executor task launch worker-0] ERROR Logging$class : Error in 
 TaskCompletionListener
 java.lang.AbstractMethodError: 
 oracle.jdbc.driver.OracleResultSetImpl.isClosed()Z
   at org.apache.spark.rdd.JdbcRDD$$anon$1.close(JdbcRDD.scala:99)
   at 
 org.apache.spark.util.NextIterator.closeIfNeeded(NextIterator.scala:63)
   at 
 org.apache.spark.rdd.JdbcRDD$$anon$1$$anonfun$1.apply(JdbcRDD.scala:71)
   at 
 org.apache.spark.rdd.JdbcRDD$$anon$1$$anonfun$1.apply(JdbcRDD.scala:71)
   at 
 org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:85)
   at 
 org.apache.spark.TaskContext$$anonfun$markTaskCompleted$1.apply(TaskContext.scala:110)
   at 
 org.apache.spark.TaskContext$$anonfun$markTaskCompleted$1.apply(TaskContext.scala:108)
   at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
   at org.apache.spark.TaskContext.markTaskCompleted(TaskContext.scala:108)
   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:64)
   at org.apache.spark.scheduler.Task.run(Task.scala:54)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:181)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 09:56:48,302 [Executor task launch worker-1] ERROR Logging$class : Error in 
 TaskCompletionListener
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5236) java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to org.apache.spark.sql.catalyst.expressions.MutableInt

2015-01-13 Thread Alex Baretta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Baretta updated SPARK-5236:

Summary: java.lang.ClassCastException: 
org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to 
org.apache.spark.sql.catalyst.expressions.MutableInt  (was: 
parquet.io.ParquetDecodingException: Can not read value at 0 in block 0)

 java.lang.ClassCastException: 
 org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to 
 org.apache.spark.sql.catalyst.expressions.MutableInt
 -

 Key: SPARK-5236
 URL: https://issues.apache.org/jira/browse/SPARK-5236
 Project: Spark
  Issue Type: Bug
Reporter: Alex Baretta

 15/01/14 05:39:27 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 18.0 
 (TID 28, localhost): parquet.io.ParquetDecodingException: Can not read value 
 at 0 in block 0 in file gs://pa-truven/20141205/parquet/P/part-r-1.parquet
 at 
 parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:213)
 at 
 parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:204)
 at 
 org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:145)
 at 
 org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
 at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
 at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
 at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
 at scala.collection.Iterator$class.foreach(Iterator.scala:727)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
 at 
 scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
 at 
 scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
 at 
 scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
 at 
 scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
 at scala.collection.AbstractIterator.to(Iterator.scala:1157)
 at 
 scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
 at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
 at 
 scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
 at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
 at 
 org.apache.spark.sql.execution.Limit$$anonfun$4.apply(basicOperators.scala:141)
 at 
 org.apache.spark.sql.execution.Limit$$anonfun$4.apply(basicOperators.scala:141)
 at 
 org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1331)
 at 
 org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1331)
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
 at org.apache.spark.scheduler.Task.run(Task.scala:56)
 at 
 org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.ClassCastException: 
 org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to 
 org.apache.spark.sql.catalyst.expressions.MutableInt
 at 
 org.apache.spark.sql.catalyst.expressions.SpecificMutableRow.setInt(SpecificMutableRow.scala:241)
 at 
 org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter.updateInt(ParquetConverter.scala:375)
 at 
 org.apache.spark.sql.parquet.CatalystPrimitiveConverter.addInt(ParquetConverter.scala:434)
 at 
 parquet.column.impl.ColumnReaderImpl$2$3.writeValue(ColumnReaderImpl.java:237)
 at 
 parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:353)
 at 
 parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:402)
 at 
 parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:194)
 ... 27 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4923) Maven build should keep publishing spark-repl

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276558#comment-14276558
 ] 

Apache Spark commented on SPARK-4923:
-

User 'rcsenkbeil' has created a pull request for this issue:
https://github.com/apache/spark/pull/4034

 Maven build should keep publishing spark-repl
 -

 Key: SPARK-4923
 URL: https://issues.apache.org/jira/browse/SPARK-4923
 Project: Spark
  Issue Type: Bug
  Components: Build, Spark Shell
Affects Versions: 1.2.0
Reporter: Peng Cheng
Priority: Critical
  Labels: shell
 Attachments: 
 SPARK-4923__Maven_build_should_keep_publishing_spark-repl.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 Spark-repl installation and deployment have been discontinued (see 
 SPARK-3452). But it's in the dependency list of a few projects that extend 
 its initialization process.
 Please remove the 'skip' setting in spark-repl and make it an 'official' API 
 to encourage more platforms to integrate with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5240) Adding `createDataSourceTable` interface to Catalog

2015-01-13 Thread wangfei (JIRA)
wangfei created SPARK-5240:
--

 Summary: Adding `createDataSourceTable` interface to Catalog
 Key: SPARK-5240
 URL: https://issues.apache.org/jira/browse/SPARK-5240
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


Adding `createDataSourceTable` interface to Catalog.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4923) Maven build should keep publishing spark-repl

2015-01-13 Thread Chip Senkbeil (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276562#comment-14276562
 ] 

Chip Senkbeil commented on SPARK-4923:
--

As the nice bot has stated, I created a pull request for this issue. I detailed 
why I marked each method/field public and provided Scaladocs for each of them 
to make the exposure of the REPL API a little nicer.

As stated in the pull request, I only tackled Scala 2.10 for now, as the Scala 
2.11 version did not appear to be ready, although I could easily be mistaken. I 
merely glanced at SparkIMain and noticed that it did not have the class 
server declaration to ship the compiled class files, nor was it - or any of the 
other classes - in the org.apache.spark.repl package.

 Maven build should keep publishing spark-repl
 -

 Key: SPARK-4923
 URL: https://issues.apache.org/jira/browse/SPARK-4923
 Project: Spark
  Issue Type: Bug
  Components: Build, Spark Shell
Affects Versions: 1.2.0
Reporter: Peng Cheng
Priority: Critical
  Labels: shell
 Attachments: 
 SPARK-4923__Maven_build_should_keep_publishing_spark-repl.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 Spark-repl installation and deployment have been discontinued (see 
 SPARK-3452). But it's in the dependency list of a few projects that extend 
 its initialization process.
 Please remove the 'skip' setting in spark-repl and make it an 'official' API 
 to encourage more platforms to integrate with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-5006) spark.port.maxRetries doesn't work

2015-01-13 Thread Andrew Or (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or closed SPARK-5006.

  Resolution: Fixed
   Fix Version/s: 1.3.0
Assignee: WangTaoTheTonic
Target Version/s: 1.3.0

 spark.port.maxRetries doesn't work
 --

 Key: SPARK-5006
 URL: https://issues.apache.org/jira/browse/SPARK-5006
 Project: Spark
  Issue Type: Bug
  Components: Deploy
Affects Versions: 1.1.0
Reporter: WangTaoTheTonic
Assignee: WangTaoTheTonic
 Fix For: 1.3.0


 We normally configure spark.port.maxRetries in a properties file or SparkConf, but 
 Utils.scala reads it from SparkEnv's conf. SparkEnv is an object whose 
 env needs to be set after the JVM is launched, and Utils.scala is also an object, 
 so in most cases portMaxRetries will get the default value 16.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3288) All fields in TaskMetrics should be private and use getters/setters

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275600#comment-14275600
 ] 

Apache Spark commented on SPARK-3288:
-

User 'ilganeli' has created a pull request for this issue:
https://github.com/apache/spark/pull/4020

 All fields in TaskMetrics should be private and use getters/setters
 ---

 Key: SPARK-3288
 URL: https://issues.apache.org/jira/browse/SPARK-3288
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: Patrick Wendell
Assignee: Dale Richardson
  Labels: starter

 This is particularly bad because we expose this as a developer API. 
 Technically a library could create a TaskMetrics object and then change the 
 values inside of it and pass it onto someone else. It can be written pretty 
 compactly like below:
 {code}
 /**
  * Number of bytes written for the shuffle by this task
  */
 @volatile private var _shuffleBytesWritten: Long = _
 def incrementShuffleBytesWritten(value: Long) = _shuffleBytesWritten += value
 def decrementShuffleBytesWritten(value: Long) = _shuffleBytesWritten -= value
 def shuffleBytesWritten = _shuffleBytesWritten
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5223) Use pickle instead of MapConvert and ListConvert in MLlib Python API

2015-01-13 Thread Davies Liu (JIRA)
Davies Liu created SPARK-5223:
-

 Summary: Use pickle instead of MapConvert and ListConvert in MLlib 
Python API
 Key: SPARK-5223
 URL: https://issues.apache.org/jira/browse/SPARK-5223
 Project: Spark
  Issue Type: Bug
  Components: MLlib, PySpark
Reporter: Davies Liu
Priority: Critical


It will introduce problems if an object in a dict/list/tuple cannot be supported by 
py4j, such as Vector.

Also, pickle may have better performance for larger objects (fewer RPCs).

In cases where an object in a dict/list cannot be pickled (such as a 
JavaObject), we should still use MapConvert/ListConvert.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4697) System properties should override environment variables

2015-01-13 Thread Andrew Or (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-4697:
-
Affects Version/s: 1.0.0

 System properties should override environment variables
 ---

 Key: SPARK-4697
 URL: https://issues.apache.org/jira/browse/SPARK-4697
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.0.0
Reporter: WangTaoTheTonic

 I found that some arguments in the YARN module take environment variables before 
 system properties, while the latter override the former in the core module.
 This should be changed in org.apache.spark.deploy.yarn.ClientArguments and 
 org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.
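
 For illustration only, a hedged sketch of the intended precedence (the property and
 environment variable names below are placeholders, not actual Spark settings): read
 the system property first and fall back to the environment variable.
 {code}
 // Hypothetical precedence helper; "spark.example.setting" and "SPARK_EXAMPLE_SETTING"
 // are made-up names used purely for illustration.
 def resolveSetting(default: String): String =
   sys.props.get("spark.example.setting")
     .orElse(sys.env.get("SPARK_EXAMPLE_SETTING"))
     .getOrElse(default)
 {code}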



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5219) Race condition in TaskSchedulerImpl and TaskSetManager

2015-01-13 Thread Andrew Or (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-5219:
-
Assignee: Shixiong Zhu

 Race condition in TaskSchedulerImpl and TaskSetManager
 --

 Key: SPARK-5219
 URL: https://issues.apache.org/jira/browse/SPARK-5219
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.2.0
Reporter: Shixiong Zhu
Assignee: Shixiong Zhu

 TaskSchedulerImpl.handleTaskGettingResult, TaskSetManager.canFetchMoreResults 
 and TaskSetManager.abort will access variables which are used in multiple 
 threads, but they don't use a correct lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5219) Race condition in TaskSchedulerImpl and TaskSetManager

2015-01-13 Thread Andrew Or (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-5219:
-
Affects Version/s: 1.2.0

 Race condition in TaskSchedulerImpl and TaskSetManager
 --

 Key: SPARK-5219
 URL: https://issues.apache.org/jira/browse/SPARK-5219
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.2.0
Reporter: Shixiong Zhu

 TaskSchedulerImpl.handleTaskGettingResult, TaskSetManager.canFetchMoreResults 
 and TaskSetManager.abort will access variables which are used in multiple 
 threads, but they don't use a correct lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3885) Provide mechanism to remove accumulators once they are no longer used

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275607#comment-14275607
 ] 

Apache Spark commented on SPARK-3885:
-

User 'ilganeli' has created a pull request for this issue:
https://github.com/apache/spark/pull/4021

 Provide mechanism to remove accumulators once they are no longer used
 -

 Key: SPARK-3885
 URL: https://issues.apache.org/jira/browse/SPARK-3885
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.0.2, 1.1.0, 1.2.0
Reporter: Josh Rosen

 Spark does not currently provide any mechanism to delete accumulators after 
 they are no longer used.  This can lead to OOMs for long-lived SparkContexts 
 that create many large accumulators.
 Part of the problem is that accumulators are registered in a global 
 {{Accumulators}} registry.  Maybe the fix would be as simple as using weak 
 references in the Accumulators registry so that accumulators can be GC'd once 
 they can no longer be used.
 In the meantime, here's a workaround that users can try:
 Accumulators have a public setValue() method that can be called (only by the 
 driver) to change an accumulator’s value.  You might be able to use this to 
 reset accumulators’ values to smaller objects (e.g. the “zero” object of 
 whatever your accumulator type is, or ‘null’ if you’re sure that the 
 accumulator will never be accessed again).
 This issue was originally reported by [~nkronenfeld] on the dev mailing list: 
 http://apache-spark-developers-list.1001551.n3.nabble.com/Fwd-Accumulator-question-td8709.html
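
 A short sketch of the setValue() workaround described above, assuming an existing
 SparkContext `sc` on the driver; the Seq[String] accumulator type is chosen purely
 for illustration:
 {code}
 // Hypothetical driver-side sketch of the workaround; this shrinks the retained value
 // but does not unregister the accumulator.
 import org.apache.spark.AccumulatorParam

 object SeqParam extends AccumulatorParam[Seq[String]] {
   def zero(v: Seq[String]): Seq[String] = Seq.empty
   def addInPlace(a: Seq[String], b: Seq[String]): Seq[String] = a ++ b
 }

 val acc = sc.accumulator(Seq.empty[String])(SeqParam)
 // ... run jobs that add to `acc` ...
 val collected = acc.value      // read the (potentially large) result on the driver
 acc.setValue(Seq.empty)        // reset to the zero value so the old result can be GC'd
 {code}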



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5222) YARN client and cluster modes have different app name behaviors

2015-01-13 Thread Andrew Or (JIRA)
Andrew Or created SPARK-5222:


 Summary: YARN client and cluster modes have different app name 
behaviors
 Key: SPARK-5222
 URL: https://issues.apache.org/jira/browse/SPARK-5222
 Project: Spark
  Issue Type: Bug
Reporter: Andrew Or
Assignee: WangTaoTheTonic


The behavior is summarized in a table produced by [~WangTaoTheTonic] here: 
https://github.com/apache/spark/pull/3557

SPARK_YARN_APP_NAME is respected only in client mode but not in cluster mode. 
This results in the strange behavior where the app name changes if the user 
runs the same application but uses a different deploy mode from before.

We should make sure the app name behavior is consistent across deploy modes 
regardless of what variable or config is set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5222) YARN client and cluster modes have different app name behaviors

2015-01-13 Thread Andrew Or (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-5222:
-
Description: 
The behavior is summarized in a table produced by [~WangTaoTheTonic] here: 
https://github.com/apache/spark/pull/3557

SPARK_YARN_APP_NAME is respected only in client mode but not in cluster mode. 
This results in the strange behavior where the app name changes if the user 
runs the same application but uses a different deploy mode from before. We 
should make sure the app name behavior is consistent across deploy modes 
regardless of what variable or config is set.

Additionally, it should be noted that because spark.app.name is required of 
all applications, the setting of SPARK_YARN_APP_NAME will not take effect 
unless we handle it preemptively in Spark submit.

  was:
The behavior is summarized in a table produced by [~WangTaoTheTonic] here: 
https://github.com/apache/spark/pull/3557

SPARK_YARN_APP_NAME is respected only in client mode but not in cluster mode. 
This results in the strange behavior where the app name changes if the user 
runs the same application but uses a different deploy mode from before.

We should make sure the app name behavior is consistent across deploy modes 
regardless of what variable or config is set.


 YARN client and cluster modes have different app name behaviors
 ---

 Key: SPARK-5222
 URL: https://issues.apache.org/jira/browse/SPARK-5222
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.0.0
Reporter: Andrew Or
Assignee: WangTaoTheTonic

 The behavior is summarized in a table produced by [~WangTaoTheTonic] here: 
 https://github.com/apache/spark/pull/3557
 SPARK_YARN_APP_NAME is respected only in client mode but not in cluster mode. 
 This results in the strange behavior where the app name changes if the user 
 runs the same application but uses a different deploy mode from before. We 
 should make sure the app name behavior is consistent across deploy modes 
 regardless of what variable or config is set.
 Additionally, it should be noted that because spark.app.name is required of 
 all applications, the setting of SPARK_YARN_APP_NAME will not take effect 
 unless we handle it preemptively in Spark submit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5223) Use pickle instead of MapConvert and ListConvert in MLlib Python API

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275671#comment-14275671
 ] 

Apache Spark commented on SPARK-5223:
-

User 'davies' has created a pull request for this issue:
https://github.com/apache/spark/pull/4023

 Use pickle instead of MapConvert and ListConvert in MLlib Python API
 

 Key: SPARK-5223
 URL: https://issues.apache.org/jira/browse/SPARK-5223
 Project: Spark
  Issue Type: Bug
  Components: MLlib, PySpark
Reporter: Davies Liu
Priority: Critical

 It will introduce problems if an object in a dict/list/tuple cannot be supported 
 by py4j, such as Vector.
 Also, pickle may have better performance for larger objects (fewer RPCs).
 In cases where an object in a dict/list cannot be pickled (such as a 
 JavaObject), we should still use MapConvert/ListConvert.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4697) System properties should override environment variables

2015-01-13 Thread Andrew Or (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-4697:
-
Assignee: WangTaoTheTonic

 System properties should override environment variables
 ---

 Key: SPARK-4697
 URL: https://issues.apache.org/jira/browse/SPARK-4697
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.0.0
Reporter: WangTaoTheTonic
Assignee: WangTaoTheTonic

 I found that some arguments in the YARN module take environment variables before 
 system properties, while the latter override the former in the core module.
 This should be changed in org.apache.spark.deploy.yarn.ClientArguments and 
 org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-733) Add documentation on use of accumulators in lazy transformation

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275608#comment-14275608
 ] 

Apache Spark commented on SPARK-733:


User 'ilganeli' has created a pull request for this issue:
https://github.com/apache/spark/pull/4022

 Add documentation on use of accumulators in lazy transformation
 ---

 Key: SPARK-733
 URL: https://issues.apache.org/jira/browse/SPARK-733
 Project: Spark
  Issue Type: Bug
  Components: Documentation
Reporter: Josh Rosen

 Accumulator updates are side-effects of RDD computations.  Unlike RDDs, 
 accumulators do not carry lineage that would allow them to be computed when 
 their values are accessed on the master.
 This can lead to confusion when accumulators are used in lazy transformations 
 like `map`:
 {code}
 val acc = sc.accumulator(0)
 data.map(x => { acc += x; f(x) })
 // Here, acc is 0 because no actions have caused the `map` to be computed.
 {code}
 As far as I can tell, our documentation only includes examples of using 
 accumulators in `foreach`, for which this problem does not occur.
 This pattern of using accumulators in map() occurs in Bagel and other Spark 
 code found in the wild.
 It might be nice to document this behavior in the accumulators section of the 
 Spark programming guide.
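
 A small, self-contained illustration of the behavior described above (assumes an
 existing SparkContext `sc`; the numbers are arbitrary):
 {code}
 // Hypothetical example of the lazy-evaluation pitfall with accumulators in map().
 val acc = sc.accumulator(0)
 val data = sc.parallelize(1 to 100)
 val mapped = data.map { x => acc += x; x * 2 }  // transformation only: nothing runs yet
 println(acc.value)   // prints 0 - the map has not been computed
 mapped.count()       // an action forces the computation
 println(acc.value)   // now prints 5050 - the side-effect updates have been applied
 {code}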



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5222) YARN client and cluster modes have different app name behaviors

2015-01-13 Thread Andrew Or (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-5222:
-
Affects Version/s: 1.0.0

 YARN client and cluster modes have different app name behaviors
 ---

 Key: SPARK-5222
 URL: https://issues.apache.org/jira/browse/SPARK-5222
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.0.0
Reporter: Andrew Or
Assignee: WangTaoTheTonic

 The behavior is summarized in a table produced by [~WangTaoTheTonic] here: 
 https://github.com/apache/spark/pull/3557
 SPARK_YARN_APP_NAME is respected only in client mode but not in cluster mode. 
 This results in the strange behavior where the app name changes if the user 
 runs the same application but uses a different deploy mode from before.
 We should make sure the app name behavior is consistent across deploy modes 
 regardless of what variable or config is set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5008) Persistent HDFS does not recognize EBS Volumes

2015-01-13 Thread Brad Willard (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275586#comment-14275586
 ] 

Brad Willard commented on SPARK-5008:
-

[~nchammas] I went ahead and created a cluster with this

./spark-ec2 -v 1.2.0 --wait 235 -k ... --copy-aws-credentials 
--hadoop-major-version 1 -z us-east-1c -s 2 -m c1.medium -t c1.medium launch 
spark-hdfs-bug --ebs-vol-size 10 --ebs-vol-type gp2 --ebs-vol-num 1

I updated the core-site.xml and switched /vol to /vol0, ran copy-dir, and 
restarted via stop-all.sh and start-all.sh.
That brings it up in a broken state. However, if I then modify the core-site.xml 
back to /vol on the master and restart, it works correctly.

So that's a partial solution. I assume this is because the master node doesn't 
get an ebs volume.

 Persistent HDFS does not recognize EBS Volumes
 --

 Key: SPARK-5008
 URL: https://issues.apache.org/jira/browse/SPARK-5008
 Project: Spark
  Issue Type: Bug
  Components: EC2
Affects Versions: 1.2.0
 Environment: 8 Node Cluster Generated from 1.2.0 spark-ec2 script.
 -m c3.2xlarge -t c3.8xlarge --ebs-vol-size 300 --ebs-vol-type gp2 
 --ebs-vol-num 1
Reporter: Brad Willard

 The cluster is built with correctly sized EBS volumes. It creates the volume at 
 /dev/xvds and it is mounted to /vol0. However, when you start persistent HDFS 
 with the start-all script, it starts but isn't correctly configured to use the 
 EBS volume.
 I'm assuming some symlinks or expected mounts are not correctly configured.
 This has worked flawlessly on all previous versions of Spark.
 I have a stupid workaround of installing pssh and mucking with it by mounting 
 it to /vol, which worked; however, it doesn't work between restarts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5222) YARN client and cluster modes have different app name behaviors

2015-01-13 Thread Andrew Or (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-5222:
-
Component/s: YARN

 YARN client and cluster modes have different app name behaviors
 ---

 Key: SPARK-5222
 URL: https://issues.apache.org/jira/browse/SPARK-5222
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.0.0
Reporter: Andrew Or
Assignee: WangTaoTheTonic

 The behavior is summarized in a table produced by [~WangTaoTheTonic] here: 
 https://github.com/apache/spark/pull/3557
 SPARK_YARN_APP_NAME is respected only in client mode but not in cluster mode. 
 This results in the strange behavior where the app name changes if the user 
 runs the same application but uses a different deploy mode from before.
 We should make sure the app name behavior is consistent across deploy modes 
 regardless of what variable or config is set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4955) Dynamic allocation doesn't work in YARN cluster mode

2015-01-13 Thread Andrew Or (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-4955:
-
Summary: Dynamic allocation doesn't work in YARN cluster mode  (was: 
Executor does not get killed after configured interval.)

 Dynamic allocation doesn't work in YARN cluster mode
 

 Key: SPARK-4955
 URL: https://issues.apache.org/jira/browse/SPARK-4955
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.2.0
Reporter: Chengxiang Li

 With executor dynamic scaling enabled, in yarn-cluster mode, after a query 
 finishes and the spark.dynamicAllocation.executorIdleTimeout interval passes, the 
 executor number is not reduced to the configured minimum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4955) Dynamic allocation doesn't work in YARN cluster mode

2015-01-13 Thread Andrew Or (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-4955:
-
Priority: Critical  (was: Major)

 Dynamic allocation doesn't work in YARN cluster mode
 

 Key: SPARK-4955
 URL: https://issues.apache.org/jira/browse/SPARK-4955
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.2.0
Reporter: Chengxiang Li
Priority: Critical

 With executor dynamic scaling enabled, in yarn-cluster mode, after a query 
 finishes and the spark.dynamicAllocation.executorIdleTimeout interval passes, the 
 executor number is not reduced to the configured minimum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4955) Dynamic allocation doesn't work in YARN cluster mode

2015-01-13 Thread Andrew Or (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-4955:
-
Assignee: Lianhui Wang

 Dynamic allocation doesn't work in YARN cluster mode
 

 Key: SPARK-4955
 URL: https://issues.apache.org/jira/browse/SPARK-4955
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.2.0
Reporter: Chengxiang Li
Assignee: Lianhui Wang
Priority: Critical

 With executor dynamic scaling enabled, in yarn-cluster mode, after a query 
 finishes and the spark.dynamicAllocation.executorIdleTimeout interval passes, the 
 executor number is not reduced to the configured minimum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5242) ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is available

2015-01-13 Thread Vladimir Grigor (JIRA)
Vladimir Grigor created SPARK-5242:
--

 Summary: ec2/spark_ec2.py launch does not work with VPC if no 
public DNS or IP is available
 Key: SPARK-5242
 URL: https://issues.apache.org/jira/browse/SPARK-5242
 Project: Spark
  Issue Type: Bug
  Components: EC2
Reporter: Vladimir Grigor


How to reproduce: a user starting a cluster in a VPC has to wait forever:
{code}
./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
--spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
--subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
Setting up security groups...
Searching for existing cluster SparkByScript...
Spark AMI: ami-1ae0166d
Launching instances...
Launched 1 slaves in eu-west-1a, regid = r-e70c5502
Launched master in eu-west-1a, regid = r-bf0f565a
Waiting for cluster to enter 'ssh-ready' state..{forever}
{code}

The problem is that the current code makes the wrong assumption that a VPC instance 
has a public_dns_name or public ip_address. Actually, it is more common that a VPC 
instance has only a private_ip_address.


The bug is already fixed in my fork; I am going to submit a pull request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2015-01-13 Thread Florian Verhein (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276572#comment-14276572
 ] 

Florian Verhein commented on SPARK-3821:


Thanks [~nchammas], that makes sense.

Created #SPARK-5241.
I'm not sure about the pre-built scenario, but am guessing e.g. 
http://s3.amazonaws.com/spark-related-packages/spark-1.2.0-bin-hadoop2.4.tgz != 
http://s3.amazonaws.com/spark-related-packages/spark-1.2.0-bin-cdh4.tgz. So 
perhaps the intent is that the spark-ec2 scripts only support cdh 
distributions...  

 Develop an automated way of creating Spark images (AMI, Docker, and others)
 ---

 Key: SPARK-3821
 URL: https://issues.apache.org/jira/browse/SPARK-3821
 Project: Spark
  Issue Type: Improvement
  Components: Build, EC2
Reporter: Nicholas Chammas
Assignee: Nicholas Chammas
 Attachments: packer-proposal.html


 Right now the creation of Spark AMIs or Docker containers is done manually. 
 With tools like [Packer|http://www.packer.io/], we should be able to automate 
 this work, and do so in such a way that multiple types of machine images can 
 be created from a single template.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5147) write ahead logs from streaming receiver are not purged because cleanupOldBlocks in WriteAheadLogBasedBlockHandler is never called

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276581#comment-14276581
 ] 

Apache Spark commented on SPARK-5147:
-

User 'jerryshao' has created a pull request for this issue:
https://github.com/apache/spark/pull/4037

 write ahead logs from streaming receiver are not purged because 
 cleanupOldBlocks in WriteAheadLogBasedBlockHandler is never called
 --

 Key: SPARK-5147
 URL: https://issues.apache.org/jira/browse/SPARK-5147
 Project: Spark
  Issue Type: Sub-task
  Components: Streaming
Affects Versions: 1.2.0
Reporter: Max Xu
Priority: Blocker

 Hi all,
 We are running a Spark streaming application with ReliableKafkaReceiver. We 
 have spark.streaming.receiver.writeAheadLog.enable set to true so write 
 ahead logs (WALs) for received data are created under receivedData/streamId 
 folder in the checkpoint directory. 
 However, old WALs are never purged based on time. receivedBlockMetadata and 
 checkpoint files are purged correctly, though. I went through the code: the 
 WriteAheadLogBasedBlockHandler class in ReceivedBlockHandler.scala is 
 responsible for cleaning up the old blocks. It has a method, cleanupOldBlocks, 
 which is never called by any class. The ReceiverSupervisorImpl class holds a 
 WriteAheadLogBasedBlockHandler instance. However, it only calls the storeBlock 
 method to create WALs but never calls the cleanupOldBlocks method to purge old 
 WALs.
 The size of the WAL folder increases constantly on HDFS. This is preventing 
 us from running the ReliableKafkaReceiver 24x7. Can somebody please take a 
 look.
 Thanks,
 Max



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5243) Spark will hang if (driver memory + executor memory) exceeds limit on a 1-worker cluster

2015-01-13 Thread yuhao yang (JIRA)
yuhao yang created SPARK-5243:
-

 Summary: Spark will hang if (driver memory + executor memory) 
exceeds limit on a 1-worker cluster
 Key: SPARK-5243
 URL: https://issues.apache.org/jira/browse/SPARK-5243
 Project: Spark
  Issue Type: Improvement
  Components: Deploy
Affects Versions: 1.2.0
 Environment: centos, others should be similar
Reporter: yuhao yang
Priority: Minor


Spark will hang when calling spark-submit under these conditions:

1. the cluster has only one worker.
2. driver memory + executor memory > worker memory
3. deploy-mode = cluster

This usually happens to beginners during development.
There should be some exit mechanism, or at least a warning message in the output 
of spark-submit.

I am preparing a PR for this case, and I would like to know your opinions on whether 
a fix is needed and what the better fix options are.
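
For discussion, here is a hedged sketch of the kind of check that could produce such a warning; the method and parameter names are made up for illustration, and this is not the actual deploy code:

{code}
// Hypothetical helper: warn when no single worker can host both the driver and an executor.
def warnIfUnschedulable(driverMemMb: Long, executorMemMb: Long, largestWorkerMemMb: Long): Unit = {
  if (driverMemMb + executorMemMb > largestWorkerMemMb) {
    System.err.println(
      s"Warning: driver memory ($driverMemMb MB) + executor memory ($executorMemMb MB) " +
      s"exceeds the largest worker's memory ($largestWorkerMemMb MB); " +
      "in cluster deploy mode the application may hang waiting for resources.")
  }
}
{code}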





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5242) ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is available

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276590#comment-14276590
 ] 

Apache Spark commented on SPARK-5242:
-

User 'voukka' has created a pull request for this issue:
https://github.com/apache/spark/pull/4038

 ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is 
 available
 ---

 Key: SPARK-5242
 URL: https://issues.apache.org/jira/browse/SPARK-5242
 Project: Spark
  Issue Type: Bug
  Components: EC2
Reporter: Vladimir Grigor
  Labels: easyfix

 How to reproduce: a user starting a cluster in a VPC has to wait forever:
 {code}
 ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
 --subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
 Setting up security groups...
 Searching for existing cluster SparkByScript...
 Spark AMI: ami-1ae0166d
 Launching instances...
 Launched 1 slaves in eu-west-1a, regid = r-e70c5502
 Launched master in eu-west-1a, regid = r-bf0f565a
 Waiting for cluster to enter 'ssh-ready' state..{forever}
 {code}
 The problem is that the current code makes the wrong assumption that a VPC instance 
 has a public_dns_name or public ip_address. Actually, it is more common that a VPC 
 instance has only a private_ip_address.
 The bug is already fixed in my fork; I am going to submit a pull request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5236) java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to org.apache.spark.sql.catalyst.expressions.MutableInt

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276594#comment-14276594
 ] 

Apache Spark commented on SPARK-5236:
-

User 'alexbaretta' has created a pull request for this issue:
https://github.com/apache/spark/pull/4039

 java.lang.ClassCastException: 
 org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to 
 org.apache.spark.sql.catalyst.expressions.MutableInt
 -

 Key: SPARK-5236
 URL: https://issues.apache.org/jira/browse/SPARK-5236
 Project: Spark
  Issue Type: Bug
Reporter: Alex Baretta

 15/01/14 05:39:27 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 18.0 
 (TID 28, localhost): parquet.io.ParquetDecodingException: Can not read value 
 at 0 in block 0 in file gs://pa-truven/20141205/parquet/P/part-r-1.parquet
 at 
 parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:213)
 at 
 parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:204)
 at 
 org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:145)
 at 
 org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
 at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
 at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
 at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
 at scala.collection.Iterator$class.foreach(Iterator.scala:727)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
 at 
 scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
 at 
 scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
 at 
 scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
 at 
 scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
 at scala.collection.AbstractIterator.to(Iterator.scala:1157)
 at 
 scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
 at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
 at 
 scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
 at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
 at 
 org.apache.spark.sql.execution.Limit$$anonfun$4.apply(basicOperators.scala:141)
 at 
 org.apache.spark.sql.execution.Limit$$anonfun$4.apply(basicOperators.scala:141)
 at 
 org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1331)
 at 
 org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1331)
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
 at org.apache.spark.scheduler.Task.run(Task.scala:56)
 at 
 org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.ClassCastException: 
 org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to 
 org.apache.spark.sql.catalyst.expressions.MutableInt
 at 
 org.apache.spark.sql.catalyst.expressions.SpecificMutableRow.setInt(SpecificMutableRow.scala:241)
 at 
 org.apache.spark.sql.parquet.CatalystPrimitiveRowConverter.updateInt(ParquetConverter.scala:375)
 at 
 org.apache.spark.sql.parquet.CatalystPrimitiveConverter.addInt(ParquetConverter.scala:434)
 at 
 parquet.column.impl.ColumnReaderImpl$2$3.writeValue(ColumnReaderImpl.java:237)
 at 
 parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:353)
 at 
 parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:402)
 at 
 parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:194)
 ... 27 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5242) ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is available

2015-01-13 Thread Vladimir Grigor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276591#comment-14276591
 ] 

Vladimir Grigor commented on SPARK-5242:


This bug is fixed in https://github.com/apache/spark/pull/4038

 ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is 
 available
 ---

 Key: SPARK-5242
 URL: https://issues.apache.org/jira/browse/SPARK-5242
 Project: Spark
  Issue Type: Bug
  Components: EC2
Reporter: Vladimir Grigor
  Labels: easyfix

 How to reproduce: a user starting a cluster in a VPC has to wait forever:
 {code}
 ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
 --subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
 Setting up security groups...
 Searching for existing cluster SparkByScript...
 Spark AMI: ami-1ae0166d
 Launching instances...
 Launched 1 slaves in eu-west-1a, regid = r-e70c5502
 Launched master in eu-west-1a, regid = r-bf0f565a
 Waiting for cluster to enter 'ssh-ready' state..{forever}
 {code}
 The problem is that the current code makes the wrong assumption that a VPC instance 
 has a public_dns_name or public ip_address. Actually, it is more common that a VPC 
 instance has only a private_ip_address.
 The bug is already fixed in my fork; I am going to submit a pull request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5241) spark-ec2 spark init scripts do not handle all hadoop (or tachyon?) dependencies correctly

2015-01-13 Thread Florian Verhein (JIRA)
Florian Verhein created SPARK-5241:
--

 Summary: spark-ec2 spark init scripts do not handle all hadoop (or 
tachyon?) dependencies correctly
 Key: SPARK-5241
 URL: https://issues.apache.org/jira/browse/SPARK-5241
 Project: Spark
  Issue Type: Bug
  Components: Build, EC2
Reporter: Florian Verhein



spark-ec2/spark/init.sh doesn't handle the hadoop dependencies completely. This 
may also be an issue for the tachyon dependencies. Related: tachyon appears to require 
builds against the right version of hadoop as well (probably causes this: 
SPARK-3185). 

Applies to the spark build from git checkout in spark/init.sh (I suspect this 
should also be changed to using mvn as that's the reference build according to 
the docs?).

May apply to pre-built spark in spark/init.sh as well, but I'm not sure about 
this. E.g. I thought that the hadoop2.4 and cdh4.2 builds of spark are 
different.

Also note that hadoop native is built from hadoop 2.4.1 on the AMI, and this is 
used regardless of HADOOP_MAJOR_VERSION in the *-hdfs modules.

Tachyon is hard coded to 0.4.1 (which is probably built against hadoop1.x?)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5240) Adding `createDataSourceTable` interface to Catalog

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276566#comment-14276566
 ] 

Apache Spark commented on SPARK-5240:
-

User 'scwf' has created a pull request for this issue:
https://github.com/apache/spark/pull/4036

 Adding `createDataSourceTable` interface to Catalog
 ---

 Key: SPARK-5240
 URL: https://issues.apache.org/jira/browse/SPARK-5240
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 Adding `createDataSourceTable` interface to Catalog.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5095) Support launching multiple mesos executors in coarse grained mesos mode

2015-01-13 Thread Josh Devins (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276607#comment-14276607
 ] 

Josh Devins commented on SPARK-5095:


Nice one, gonna try and test it this week.




 Support launching multiple mesos executors in coarse grained mesos mode
 ---

 Key: SPARK-5095
 URL: https://issues.apache.org/jira/browse/SPARK-5095
 Project: Spark
  Issue Type: Improvement
  Components: Mesos
Reporter: Timothy Chen

 Currently, in coarse-grained Mesos mode, it's expected that we only launch one 
 Mesos executor that launches one JVM process to launch multiple Spark 
 executors.
 However, this becomes a problem when the JVM process launched is larger than 
 an ideal size (30 GB is the recommended value from Databricks), which causes the GC 
 problems reported on the mailing list.
 We should support launching multiple executors when large enough resources 
 are available for Spark to use, and these resources are still under the 
 configured limit.
 This is also applicable when users want to specify the number of executors to be 
 launched on each node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5224) parallelize list/ndarray is really slow

2015-01-13 Thread Davies Liu (JIRA)
Davies Liu created SPARK-5224:
-

 Summary: parallelize list/ndarray is really slow
 Key: SPARK-5224
 URL: https://issues.apache.org/jira/browse/SPARK-5224
 Project: Spark
  Issue Type: Bug
  Components: PySpark
Affects Versions: 1.2.0
Reporter: Davies Liu
Priority: Blocker


After the default batchSize was changed to 0 (batching based on object size), 
parallelize() still uses BatchedSerializer with batchSize=1.

Also, BatchedSerializer does not work well with list and numpy.ndarray.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5224) parallelize list/ndarray is really slow

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275711#comment-14275711
 ] 

Apache Spark commented on SPARK-5224:
-

User 'davies' has created a pull request for this issue:
https://github.com/apache/spark/pull/4024

 parallelize list/ndarray is really slow
 ---

 Key: SPARK-5224
 URL: https://issues.apache.org/jira/browse/SPARK-5224
 Project: Spark
  Issue Type: Bug
  Components: PySpark
Affects Versions: 1.2.0
Reporter: Davies Liu
Priority: Blocker

 After the default batchSize was changed to 0 (batching based on object size), 
 parallelize() still uses BatchedSerializer with batchSize=1.
 Also, BatchedSerializer does not work well with list and numpy.ndarray.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5225) Support coalesed Input Metrics from different sources

2015-01-13 Thread Kostas Sakellis (JIRA)
Kostas Sakellis created SPARK-5225:
--

 Summary: Support coalesed Input Metrics from different sources
 Key: SPARK-5225
 URL: https://issues.apache.org/jira/browse/SPARK-5225
 Project: Spark
  Issue Type: Bug
Reporter: Kostas Sakellis


Currently, if a task reads data from more than one block, and the blocks come from 
different read methods, we ignore the bytes from the second read method. For example:

      CoalescedRDD
           |
         Task1
        /  |  \
  hadoop hadoop cached

If Task1 starts reading from the hadoop blocks first, then the input metrics 
for Task1 will only contain input metrics from the hadoop blocks and ignore the 
input metrics from the cached blocks. We need to change the way we collect input 
metrics so that it is not a single value but rather a collection of input 
metrics for a task.
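
A rough sketch of that idea with made-up names (this is not the actual TaskMetrics API): keep one bytes-read entry per read method instead of a single value per task.

{code}
// Hypothetical sketch only; InputBytesByMethod is an illustrative name, not Spark code.
import scala.collection.mutable

class InputBytesByMethod {
  private val bytesByMethod = mutable.Map.empty[String, Long]  // e.g. "Hadoop", "Memory"

  def add(readMethod: String, bytes: Long): Unit =
    bytesByMethod(readMethod) = bytesByMethod.getOrElse(readMethod, 0L) + bytes

  def totalBytesRead: Long = bytesByMethod.values.sum
}
{code}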



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-2909) Indexing for SparseVector in pyspark

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275741#comment-14275741
 ] 

Apache Spark commented on SPARK-2909:
-

User 'MechCoder' has created a pull request for this issue:
https://github.com/apache/spark/pull/4025

 Indexing for SparseVector in pyspark
 

 Key: SPARK-2909
 URL: https://issues.apache.org/jira/browse/SPARK-2909
 Project: Spark
  Issue Type: Improvement
  Components: MLlib, PySpark
Reporter: Joseph K. Bradley
Priority: Minor

 SparseVector in pyspark does not currently support indexing, except by 
 examining the internal representation.  Though indexing is a pricy operation, 
 it would be useful for, e.g., iterating through a dataset (RDD[LabeledPoint]) 
 and operating on a single feature.
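
 For reference, a small sketch of how indexing into a sparse representation typically
 works (illustrative Scala, not the pyspark SparseVector code): the stored index array
 is searched, and a miss means the value is zero.
 {code}
 // Hypothetical illustration: `indices` is sorted and parallel to `values`;
 // positions not present in `indices` are implicitly zero.
 def sparseApply(indices: Array[Int], values: Array[Double], i: Int): Double = {
   val pos = java.util.Arrays.binarySearch(indices, i)
   if (pos >= 0) values(pos) else 0.0
 }

 // Example: the sparse form of [0.0, 3.0, 0.0, 5.0]
 val idx = Array(1, 3)
 val vals = Array(3.0, 5.0)
 assert(sparseApply(idx, vals, 3) == 5.0)
 assert(sparseApply(idx, vals, 2) == 0.0)
 {code}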



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-4879) Missing output partitions after job completes with speculative execution

2015-01-13 Thread Zach Fry (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14272291#comment-14272291
 ] 

Zach Fry edited comment on SPARK-4879 at 1/13/15 7:53 PM:
--

Hey Josh,

I was able to reproduce the missing file using the speculation settings in my 
previous comment:

{code}
scala> 15/01/09 18:33:28 WARN scheduler.TaskSetManager: Lost task 42.1 in stage 
0.0 (TID 113, redacted-03): java.io.IOException: Failed to save output of 
task: attempt_201501091833__m_42_113

org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:160)

org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)

org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
org.apache.spark.SparkHadoopWriter.commit(SparkHadoopWriter.scala:109)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:991)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)

java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
15/01/09 18:33:47 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 
(TID 0, redacted-03): java.io.IOException: The temporary job-output directory 
hdfs://redacted-01:8020/test2/_temporary doesn't exist!

org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)

org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:240)

org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:89)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:980)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)

java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
{code}

Notice here that there are only 99 part files and part-00042 is missing (as 
seen in the stacktrace above)
{code}
  $ hadoop fs -ls /test2 | grep part | wc -l
99
 $ hadoop fs -ls /test2 | grep part-0004
-rw-r--r--   3 redacted supergroup  8 2015-01-09 18:33 
/test2/part-00040
-rw-r--r--   3 redacted supergroup  8 2015-01-09 18:33 
/test2/part-00041
-rw-r--r--   3 redacted supergroup  8 2015-01-09 18:33 
/test2/part-00043
-rw-r--r--   3 redacted supergroup  8 2015-01-09 18:33 
/test2/part-00044
-rw-r--r--   3 redacted supergroup  8 2015-01-09 18:33 
/test2/part-00045
-rw-r--r--   3 redacted supergroup  8 2015-01-09 18:33 
/test2/part-00046
-rw-r--r--   3 redacted supergroup  8 2015-01-09 18:33 
/test2/part-00047
-rw-r--r--   3 redacted supergroup  8 2015-01-09 18:33 
/test2/part-00048
-rw-r--r--   3 redacted supergroup  8 2015-01-09 18:33 
/test2/part-00049
{code}




was (Author: zfry):
Hey Josh,

I was able to reproduce the missing file using the speculation settings in my 
previous comment:

{code}
scala 15/01/09 18:33:28 WARN scheduler.TaskSetManager: Lost task 42.1 in stage 
0.0 (TID 113, redacted-03): java.io.IOException: Failed to save output of 
task: attempt_201501091833__m_42_113

org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:160)

org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)

org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
org.apache.spark.SparkHadoopWriter.commit(SparkHadoopWriter.scala:109)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:991)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)

java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)


[jira] [Updated] (SPARK-5225) Support coalesced Input Metrics from different sources

2015-01-13 Thread Kostas Sakellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostas Sakellis updated SPARK-5225:
---
Description: 
Currently, if a task reads data from more than one block and the blocks come from 
different read methods, we ignore the bytes from the second read method. For example:
{noformat}
      CoalescedRDD
           |
         Task1
        /  |  \
  hadoop hadoop cached
{noformat}
If Task1 starts reading from the hadoop blocks first, then the input metrics 
for Task1 will only contain input metrics from the hadoop blocks and ignore the 
input metrics from the cached blocks. We need to change the way we collect input 
metrics so that it is not a single value but rather a collection of input 
metrics for a task.

  was:
Currently, If task reads data from more than one block and it is from different 
read methods we ignore the second read method bytes. For example:
 CoalescedRDD
| 
Task1 
  / |\
/   |  \   
  hadoop   hadoop  cached

if Task1 starts reading from the hadoop blocks first, then the input metrics 
for Task1 will only contain input metrics from the hadoop blocks and ignre the 
input metrics from cached blocks. We need to change the way we collect input 
metrics so that it is not a single value but rather a collection of input 
metrics for a task. 


 Support coalesced Input Metrics from different sources
 -

 Key: SPARK-5225
 URL: https://issues.apache.org/jira/browse/SPARK-5225
 Project: Spark
  Issue Type: Bug
Reporter: Kostas Sakellis

 Currently, if a task reads data from more than one block and the blocks come 
 from different read methods, we ignore the bytes from the second read method. 
 For example:
 {noformat}
       CoalescedRDD
            |
          Task1
         /  |  \
   hadoop hadoop cached
 {noformat}
 If Task1 starts reading from the hadoop blocks first, then the input metrics 
 for Task1 will only contain input metrics from the hadoop blocks and ignore 
 the input metrics from the cached blocks. We need to change the way we collect 
 input metrics so that it is not a single value but rather a collection of 
 input metrics for a task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5211) Restore HiveMetastoreTypes.toDataType

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275846#comment-14275846
 ] 

Apache Spark commented on SPARK-5211:
-

User 'yhuai' has created a pull request for this issue:
https://github.com/apache/spark/pull/4026

 Restore HiveMetastoreTypes.toDataType
 -

 Key: SPARK-5211
 URL: https://issues.apache.org/jira/browse/SPARK-5211
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: Yin Huai
Priority: Critical

 It was a public API. Since developers are using it, we need to get it back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3185) SPARK launch on Hadoop 2 in EC2 throws Tachyon exception when Formatting JOURNAL_FOLDER

2015-01-13 Thread Nicholas Chammas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Chammas updated SPARK-3185:

Description: 
{code}
org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate 
with client version 4
{code}

When I launch Spark 1.0.2 on Hadoop 2 in a new EC2 cluster, the above Tachyon 
exception is thrown when formatting the JOURNAL_FOLDER.

No exception occurs when I launch on Hadoop 1.

Launch used:
{code}
./spark-ec2 -k spark_cluster -i /home/ec2-user/kagi/spark_cluster.ppk 
--zone=us-east-1a --hadoop-major-version=2 --spot-price=0.0165 -s 3 launch 
sparkProd
{code}

{code}
log snippet
Formatting Tachyon Master @ ec2-54-80-49-244.compute-1.amazonaws.com
Formatting JOURNAL_FOLDER: /root/tachyon/libexec/../journal/
Exception in thread main java.lang.RuntimeException: 
org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate 
with client version 4
at tachyon.util.CommonUtils.runtimeException(CommonUtils.java:246)
at tachyon.UnderFileSystemHdfs.init(UnderFileSystemHdfs.java:73)
at tachyon.UnderFileSystemHdfs.getClient(UnderFileSystemHdfs.java:53)
at tachyon.UnderFileSystem.get(UnderFileSystem.java:53)
at tachyon.Format.main(Format.java:54)
Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot 
communicate with client version 4
at org.apache.hadoop.ipc.Client.call(Client.java:1070)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at 
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:203)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at tachyon.UnderFileSystemHdfs.init(UnderFileSystemHdfs.java:69)
... 3 more
Killed 0 processes
Killed 0 processes
ec2-54-167-219-159.compute-1.amazonaws.com: Killed 0 processes
ec2-54-198-198-17.compute-1.amazonaws.com: Killed 0 processes
ec2-54-166-36-0.compute-1.amazonaws.com: Killed 0 processes
---end snippet---
{code}

*I don't have this problem when I launch without the --hadoop-major-version=2 
(which defaults to Hadoop 1.x).*

  was:
org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate 
with client version 4

When I launch SPARK 1.0.2 on Hadoop 2 in a new EC2 cluster, the above tachyon 
exception is thrown when Formatting JOURNAL_FOLDER.

No exception occurs when I launch on Hadoop 1.

Launch used:
./spark-ec2 -k spark_cluster -i /home/ec2-user/kagi/spark_cluster.ppk 
--zone=us-east-1a --hadoop-major-version=2 --spot-price=0.0165 -s 3 launch 
sparkProd

log snippet
Formatting Tachyon Master @ ec2-54-80-49-244.compute-1.amazonaws.com
Formatting JOURNAL_FOLDER: /root/tachyon/libexec/../journal/
Exception in thread main java.lang.RuntimeException: 
org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate 
with client version 4
at tachyon.util.CommonUtils.runtimeException(CommonUtils.java:246)
at tachyon.UnderFileSystemHdfs.init(UnderFileSystemHdfs.java:73)
at tachyon.UnderFileSystemHdfs.getClient(UnderFileSystemHdfs.java:53)
at tachyon.UnderFileSystem.get(UnderFileSystem.java:53)
at tachyon.Format.main(Format.java:54)
Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot 
communicate with client version 4
at org.apache.hadoop.ipc.Client.call(Client.java:1070)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at 
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:203)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at 

[jira] [Commented] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2015-01-13 Thread Florian Verhein (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276263#comment-14276263
 ] 

Florian Verhein commented on SPARK-3821:


This is great stuff! It'll also help serve as some documentation for AMI 
requirements when using the spark-ec2 scripts.  

Re the above, I think everything in create_image.sh can be refactored to packer 
(+ duplicate removal - e.g. root login). I've attempted to do this in a fork of 
[~nchammas]'s work, but my use case is a bit different in that I need to go 
from a fresh centos6 minimal (rather than an amazon linux AMI) and then add 
other things.

Possibly related to AMI generation in general: I've noticed that the version 
dependencies in the spark-ec2 scripts are broken. I suspect this will need to 
be handled in both the image and the setup. For example:
- It looks like Spark needs to be built with the right hadoop profile to work, 
but this isn't adhered to. This applies when spark is built from a git checkout 
or from an existing build. This is likely also the case with Tachyon too. 
Probably the cause of https://issues.apache.org/jira/browse/SPARK-3185
- The hadoop native libs are built on the image using 2.4.1, but then copied 
into whatever hadoop build is downloaded in the ephemeral-hdfs and 
persistent-hdfs scripts. I suspect that could cause issues too. Since building 
hadoop is very time consuming, it's something you'd want on the image - hence 
creating a dependency. 
- The version dependencies for other things like ganglia aren't documented (I 
believe this is installed on the image but duplicated again in 
spark-ec2/ganglia). I've found that the ganglia config doesn't work for me (but 
recall I'm using a different base AMI, so I'll likely get a different ganglia 
version). I have a sneaky suspicion that the hadoop configs in spark-ec2 won't 
work across the hadoop versions either (but, fingers crossed!).

Re the above, I might try keeping the entire hadoop build (from the image 
creation) for the hdfs setup.

Sorry for the sidetrack, but I'm struggling through all this, so hoping it might ring 
a bell for someone.  

p.s. With the image automation, it might also be worth considering putting more 
on the image as an option (esp for people happy to build their own AMIs). For 
example, I see no reason why the module init.sh scripts can't be run from 
packer in order to speed start-up times of the cluster :) 


 Develop an automated way of creating Spark images (AMI, Docker, and others)
 ---

 Key: SPARK-3821
 URL: https://issues.apache.org/jira/browse/SPARK-3821
 Project: Spark
  Issue Type: Improvement
  Components: Build, EC2
Reporter: Nicholas Chammas
Assignee: Nicholas Chammas
 Attachments: packer-proposal.html


 Right now the creation of Spark AMIs or Docker containers is done manually. 
 With tools like [Packer|http://www.packer.io/], we should be able to automate 
 this work, and do so in such a way that multiple types of machine images can 
 be created from a single template.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5167) Move Row into sql package and make it usable for Java

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276327#comment-14276327
 ] 

Apache Spark commented on SPARK-5167:
-

User 'rxin' has created a pull request for this issue:
https://github.com/apache/spark/pull/4030

 Move Row into sql package and make it usable for Java
 -

 Key: SPARK-5167
 URL: https://issues.apache.org/jira/browse/SPARK-5167
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: Reynold Xin

 This will help us eliminate the duplicated Java code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-4894) Add Bernoulli-variant of Naive Bayes

2015-01-13 Thread RJ Nowling (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276380#comment-14276380
 ] 

RJ Nowling edited comment on SPARK-4894 at 1/14/15 2:06 AM:


Hi [~lmcguire]

Always happy to have more help! :)

I started looking through the Spark NB functions but I haven't started writing 
code yet.  The docs for NB mention that using binary features will cause the 
multinomial NB to act like Bernoulli NB.  I don't believe the documentation is 
correct, at least when smoothing is used since P(0) != 1 - P(1). I was 
planning on comparing the sklearn implementation with the Spark implementation 
and showing that the docs were wrong.  Once verified, I think the changes will 
be very small to add a Bernoulli mode controlled by a flag in the constructor.

I won't get to this until next week, though.  If you have time now and want to 
tackle this, I'd be happy to hand it over to you and review any patches.  (I'm 
not a committer, though -- [~mengxr] would have to sign off.) Otherwise, if 
you want to wait until I have a patch and test it, that could work, too.  What 
do you think?


was (Author: rnowling):
Hi @lmcguire,

Always happy to have more help! :)

I started looking through the Spark NB functions but I haven't started writing 
code yet.  The docs for NB mention that using binary features will cause the 
multinomial NB to act like Bernoulli NB.  I don't believe the documentation is 
correct, at least when smoothing is used since P(0) != 1 - P(1).I was 
planning on comparing the sklearn implementation with the Spark implementation 
and showing that the docs were wrong.  Once verified, I think the changes will 
be very small to add a Bernoulli mode controlled by a flag in the constructor.

I won't get to this until next week, though.  If you have time now and want to 
tackle this, I'd be happy to hand it over to you and review any patches.  (I'm 
not a committer, though -- [~mengxr] would have to sign off.)Otherwise, if 
you want to wait until I have a patch and test it, that could work, too.  What 
do you think?

 Add Bernoulli-variant of Naive Bayes
 

 Key: SPARK-4894
 URL: https://issues.apache.org/jira/browse/SPARK-4894
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Affects Versions: 1.1.1
Reporter: RJ Nowling

 MLlib only supports the multinomial-variant of Naive Bayes.  The Bernoulli 
 version of Naive Bayes is more useful for situations where the features are 
 binary values.
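
As a quick, hedged illustration of the P(0) != 1 - P(1) point raised in the comment 
above (all numbers hypothetical, not Spark code), the two models handle a zero-valued 
binary feature differently once Laplace smoothing is applied:

{code}
// Toy comparison: multinomial NB with binary features vs. Bernoulli NB.
// A zero-valued feature contributes nothing under the multinomial model,
// but contributes (1 - p) under the Bernoulli model.
object NaiveBayesToy {
  def main(args: Array[String]): Unit = {
    val docsInClass   = 4.0
    val featureCounts = Array(3.0, 1.0) // times each feature was 1 in this class
    val lambda        = 1.0             // Laplace smoothing

    // Multinomial: theta_i = (count_i + lambda) / (sum(counts) + lambda * numFeatures)
    val total = featureCounts.sum
    val theta = featureCounts.map(c => (c + lambda) / (total + lambda * featureCounts.length))

    // Bernoulli: p_i = (count_i + lambda) / (docsInClass + 2 * lambda)
    val p = featureCounts.map(c => (c + lambda) / (docsInClass + 2 * lambda))

    // Likelihood of the binary vector x = (0, 1) under each model:
    val multinomialLikelihood = theta(1)          // the zero feature is simply skipped
    val bernoulliLikelihood   = (1 - p(0)) * p(1) // the zero feature still contributes

    println(s"multinomial: $multinomialLikelihood, bernoulli: $bernoulliLikelihood")
  }
}
{code}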



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4296) Throw Expression not in GROUP BY when using same expression in group by clause and select clause

2015-01-13 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276278#comment-14276278
 ] 

Cheng Lian commented on SPARK-4296:
---

Yeah, I think whenever we use expressions that are not {{NamedExpression}} in 
GROUP BY, this issue may be triggered, because an intermediate alias is 
introduced during the analysis phase. That's why I tried to fix all similar aliases 
in PR #3910 (but failed).

 Throw Expression not in GROUP BY when using same expression in group by 
 clause and  select clause
 ---

 Key: SPARK-4296
 URL: https://issues.apache.org/jira/browse/SPARK-4296
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0, 1.1.1, 1.2.0
Reporter: Shixiong Zhu
Assignee: Cheng Lian
Priority: Blocker

 When the input data has a complex structure, using the same expression in the 
 GROUP BY clause and the SELECT clause will throw "Expression not in GROUP BY".
 {code:java}
 val sqlContext = new org.apache.spark.sql.SQLContext(sc)
 import sqlContext.createSchemaRDD
 case class Birthday(date: String)
 case class Person(name: String, birthday: Birthday)
 val people = sc.parallelize(List(Person("John", Birthday("1990-01-22")),
   Person("Jim", Birthday("1980-02-28"))))
 people.registerTempTable("people")
 val year = sqlContext.sql(
   "select count(*), upper(birthday.date) from people group by upper(birthday.date)")
 year.collect
 {code}
 Here is the plan of year:
 {code:java}
 SchemaRDD[3] at RDD at SchemaRDD.scala:105
 == Query Plan ==
 == Physical Plan ==
 org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Expression 
 not in GROUP BY: Upper(birthday#1.date AS date#9) AS c1#3, tree:
 Aggregate [Upper(birthday#1.date)], [COUNT(1) AS c0#2L,Upper(birthday#1.date 
 AS date#9) AS c1#3]
  Subquery people
   LogicalRDD [name#0,birthday#1], MapPartitionsRDD[1] at mapPartitions at 
 ExistingRDD.scala:36
 {code}
 The bug is in the equality test between `Upper(birthday#1.date)` and 
 `Upper(birthday#1.date AS date#9)`.
 Maybe Spark SQL needs a mechanism to compare Alias expressions with non-Alias 
 expressions.
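
A hedged sketch of one possible direction (illustrative only; this is not the fix 
attempted in PR #3910, and it assumes Catalyst's Alias and TreeNode.transform are 
available under these names):

{code}
// Compare a SELECT-list expression against the GROUP BY expressions while
// ignoring any Alias wrappers the analyzer may have introduced.
import org.apache.spark.sql.catalyst.expressions.{Alias, Expression}

def stripAliases(e: Expression): Expression = e.transform {
  case Alias(child, _) => child
}

def coveredByGroupBy(expr: Expression, groupingExprs: Seq[Expression]): Boolean =
  groupingExprs.exists(g => stripAliases(g) == stripAliases(expr))
{code}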



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5233) Error replay of WAL when recovered from driver failure

2015-01-13 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-5233:
--

 Summary: Error replay of WAL when recovered from driver failure
 Key: SPARK-5233
 URL: https://issues.apache.org/jira/browse/SPARK-5233
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.0
Reporter: Saisai Shao


Spark Streaming writes all the events into the WAL for driver recovery; the 
sequence in the WAL may look like this:

{code}

BlockAdditionEvent --- BlockAdditionEvent --- BlockAdditionEvent --- 
BatchAllocationEvent --- BatchCleanupEvent --- BlockAdditionEvent --- 
BlockAdditionEvent --- 'Driver Down Time' --- BlockAdditionEvent --- 
BlockAdditionEvent --- BatchAllocationEvent

{code}

When the driver recovers from a failure, it replays all the existing metadata WAL 
to rebuild the right status. In this situation, the two BlockAdditionEvents from 
before the failure are put into the received block queue. After the driver starts, 
newly incoming blocks are also put into this queue, and a follow-up 
BatchAllocationEvent allocates an allocatedBlocks by draining the whole queue. So 
old data that does not belong to this batch is also mixed into it; here is the 
partial log:

{code}
15/01/13 17:19:10 INFO KafkaInputDStream: block store result for batch 
142114075 ms

15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
WriteAheadLogFileSegment(file:   
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
   53201,46704,480)
197757 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
WriteAheadLogFileSegment(file:   
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
   53201,47188,480)
197758 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
WriteAheadLogFileSegment(file:   
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
   53201,47672,480)
197759 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
WriteAheadLogFileSegment(file:   
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
   53201,48156,480) 
 
197760 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
WriteAheadLogFileSegment(file:   
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
   53201,48640,480)
197761 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
WriteAheadLogFileSegment(file:   
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
   53201,49124,480)
197762 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
WriteAheadLogFileSegment(file:   
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
   07074,0,44184)
197763 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
WriteAheadLogFileSegment(file:   
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
   07074,44188,58536)
197764 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
WriteAheadLogFileSegment(file:   
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
   07074,102728,60168)
197765 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
WriteAheadLogFileSegment(file:   
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
   07074,162900,64584)
197766 15/01/13 17:19:10 INFO KafkaInputDStream: log segment: 
WriteAheadLogFileSegment(file:   
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140747074-14211408
   07074,227488,51240)
{code}

The old log 
/home/jerryshao/project/apache-spark/checkpoint-wal-test/receivedData/0/log-1421140593201-14211406
 is obviously far older than the current batch interval, and it will be fetched 
again and added for processing.

This issue was hidden before, because the previous code never deleted the old 
received-data WAL. As far as I know, this will lead to unwanted results.

Basically, we miss some BatchAllocationEvents when recovering from failure. I 
think we need to replay and apply all the events correctly.
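
A hedged sketch of the recovery idea (event and method names here are illustrative, 
not the actual ReceivedBlockTracker API): while replaying the WAL, only block 
additions that were never covered by a later BatchAllocationEvent should remain 
eligible for the next batch.

{code}
sealed trait TrackerEvent
case class BlockAdditionEvent(blockId: String) extends TrackerEvent
case class BatchAllocationEvent(batchTime: Long, blockIds: Seq[String]) extends TrackerEvent
case class BatchCleanupEvent(batchTimes: Seq[Long]) extends TrackerEvent

// Returns the blocks that are still unallocated after replaying the WAL.
def replay(events: Seq[TrackerEvent]): Seq[String] = {
  val unallocated = scala.collection.mutable.LinkedHashSet.empty[String]
  events.foreach {
    case BlockAdditionEvent(id)       => unallocated += id
    case BatchAllocationEvent(_, ids) => unallocated --= ids // already assigned to a batch
    case BatchCleanupEvent(_)         => ()                  // cleanup does not affect pending blocks
  }
  unallocated.toSeq
}
{code}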



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2015-01-13 Thread Nicholas Chammas (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276411#comment-14276411
 ] 

Nicholas Chammas commented on SPARK-3821:
-

Hi [~florianverhein] and thanks for chiming in!

{quote}
Re the above, I think everything in create_image.sh can be refactored to packer 
(+ duplicate removal - e.g. root login).
{quote}

Definitely. I'm hoping to make as few changes as possible to the existing 
{{create_image.sh}} script to reduce the review burden, but after this initial 
proposal is accepted it makes sense to refactor these scripts. There is some 
related work proposed in [SPARK-5189].

Some of the things you call out regarding version mismatches and whatnot sound 
like they might merit their own JIRA issues.

For example:

{quote}
It looks like Spark needs to be built with the right hadoop profile to work, 
but this isn't adhered to. 
{quote}

I haven't tested this out, but from the Spark init script, it looks like the 
correct version of Spark is used in [the pre-built 
scenario|https://github.com/mesos/spark-ec2/blob/3a95101c70e6892a8a48cc54094adaed1458487a/spark/init.sh#L109].
 Not so in the [build-from-git 
scenario|https://github.com/mesos/spark-ec2/blob/3a95101c70e6892a8a48cc54094adaed1458487a/spark/init.sh#L21],
 so nice catch. Could you file a JIRA issue for that?

{quote}
For example, I see no reason why the module init.sh scripts can't be run from 
packer in order to speed start-up times of the cluster
{quote}

Regarding this and other ideas regarding pre-baking more on the images, [that's 
how this proposal started, 
actually|https://github.com/nchammas/spark-ec2/blob/9c28878694171ba085a10acd4405c702397d28ce/packer/README.md#base-vs-spark-pre-installed]
 (here's the [original Packer 
template|https://github.com/nchammas/spark-ec2/blob/9c28878694171ba085a10acd4405c702397d28ce/packer/spark-packer.json#L118-L133]).
 We decided to rip that out to reduce the complexity of the initial proposal 
and make it easier to specify different versions of Spark and Hadoop at launch 
time.

 Develop an automated way of creating Spark images (AMI, Docker, and others)
 ---

 Key: SPARK-3821
 URL: https://issues.apache.org/jira/browse/SPARK-3821
 Project: Spark
  Issue Type: Improvement
  Components: Build, EC2
Reporter: Nicholas Chammas
Assignee: Nicholas Chammas
 Attachments: packer-proposal.html


 Right now the creation of Spark AMIs or Docker containers is done manually. 
 With tools like [Packer|http://www.packer.io/], we should be able to automate 
 this work, and do so in such a way that multiple types of machine images can 
 be created from a single template.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5213) Support the SQL Parser Registry

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14274908#comment-14274908
 ] 

Apache Spark commented on SPARK-5213:
-

User 'chenghao-intel' has created a pull request for this issue:
https://github.com/apache/spark/pull/4015

 Support the SQL Parser Registry
 ---

 Key: SPARK-5213
 URL: https://issues.apache.org/jira/browse/SPARK-5213
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Reporter: Cheng Hao

 Currently, the SQL parser dialect is hard-coded in SQLContext, which is not 
 easy to extend. We need to provide a SQL parser dialect factory util to 
 manage dialects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5213) Support the SQL Parser Registry

2015-01-13 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-5213:


 Summary: Support the SQL Parser Registry
 Key: SPARK-5213
 URL: https://issues.apache.org/jira/browse/SPARK-5213
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Reporter: Cheng Hao


Currently, the SQL parser dialect is hard-coded in SQLContext, which is not easy 
to extend. We need to provide a SQL parser dialect factory util to manage dialects.
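
A minimal Scala sketch of what such a registry could look like (purely illustrative; 
not the API proposed in PR #4015):

{code}
import scala.collection.concurrent.TrieMap

// A dialect turns a SQL string into some parsed representation
// (a LogicalPlan in Spark SQL; AnyRef here to keep the sketch self-contained).
trait SQLDialect {
  def parse(sql: String): AnyRef
}

object SQLParserRegistry {
  private val dialects = TrieMap.empty[String, () => SQLDialect]

  def register(name: String, factory: () => SQLDialect): Unit =
    dialects.put(name, factory)

  def lookup(name: String): SQLDialect =
    dialects.getOrElse(name, sys.error(s"Unknown SQL dialect: $name"))()
}
{code}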



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-1507) Spark on Yarn: Add support for user to specify # cores for ApplicationMaster

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275244#comment-14275244
 ] 

Apache Spark commented on SPARK-1507:
-

User 'WangTaoTheTonic' has created a pull request for this issue:
https://github.com/apache/spark/pull/4018

 Spark on Yarn: Add support for user to specify # cores for ApplicationMaster
 

 Key: SPARK-1507
 URL: https://issues.apache.org/jira/browse/SPARK-1507
 Project: Spark
  Issue Type: Improvement
  Components: YARN
Affects Versions: 1.0.0
Reporter: Thomas Graves

 Now that Hadoop 2.x can schedule cores as a resource we should allow the user 
 to specify the # of cores for the ApplicationMaster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5219) Race condition in TaskSchedulerImpl and TaskSetManager

2015-01-13 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-5219:
---

 Summary: Race condition in TaskSchedulerImpl and TaskSetManager
 Key: SPARK-5219
 URL: https://issues.apache.org/jira/browse/SPARK-5219
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Reporter: Shixiong Zhu


TaskSchedulerImpl.handleTaskGettingResult, TaskSetManager.canFetchMoreResults 
and TaskSetManager.abort will access variables which are used in multiple 
threads, but they don't use a correct lock.
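
As a hedged sketch of the general pattern the fix needs (illustrative only, not the 
actual TaskSetManager code), the shared state should be read and written under the 
same lock:

{code}
// Guard the shared result-size bookkeeping with a single lock on both paths.
class ResultSizeTracker(maxResultSize: Long) {
  private val lock = new Object
  private var totalResultSize = 0L

  // Called when a task result arrives.
  def recordResult(size: Long): Unit = lock.synchronized {
    totalResultSize += size
  }

  // Checked before fetching another result; synchronizes on the same lock.
  def canFetchMoreResults(nextSize: Long): Boolean = lock.synchronized {
    maxResultSize <= 0 || totalResultSize + nextSize <= maxResultSize
  }
}
{code}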



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5219) Race condition in TaskSchedulerImpl and TaskSetManager

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275284#comment-14275284
 ] 

Apache Spark commented on SPARK-5219:
-

User 'zsxwing' has created a pull request for this issue:
https://github.com/apache/spark/pull/4019

 Race condition in TaskSchedulerImpl and TaskSetManager
 --

 Key: SPARK-5219
 URL: https://issues.apache.org/jira/browse/SPARK-5219
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Reporter: Shixiong Zhu

 TaskSchedulerImpl.handleTaskGettingResult, TaskSetManager.canFetchMoreResults 
 and TaskSetManager.abort will access variables which are used in multiple 
 threads, but they don't use a correct lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5215) concat support in sqlcontext

2015-01-13 Thread Adrian Wang (JIRA)
Adrian Wang created SPARK-5215:
--

 Summary: concat support in sqlcontext
 Key: SPARK-5215
 URL: https://issues.apache.org/jira/browse/SPARK-5215
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Reporter: Adrian Wang


Define concat() following the rules in
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
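
One relevant detail of the Hive semantics (a hedged Scala sketch of the evaluation 
rule only, not the Catalyst expression added in PR #4017): concat returns NULL if 
any argument is NULL, otherwise the concatenation of its arguments.

{code}
// Illustrative evaluation rule for Hive-style concat.
def hiveConcat(args: Seq[Any]): Any =
  if (args.exists(_ == null)) null
  else args.mkString("")
{code}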



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5215) concat support in sqlcontext

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275011#comment-14275011
 ] 

Apache Spark commented on SPARK-5215:
-

User 'adrian-wang' has created a pull request for this issue:
https://github.com/apache/spark/pull/4017

 concat support in sqlcontext
 

 Key: SPARK-5215
 URL: https://issues.apache.org/jira/browse/SPARK-5215
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Reporter: Adrian Wang

 Define concat() following the rules in
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5218) Report per stage remaining time estimate for each stage.

2015-01-13 Thread Prashant Sharma (JIRA)
Prashant Sharma created SPARK-5218:
--

 Summary: Report per stage remaining time estimate for each stage.
 Key: SPARK-5218
 URL: https://issues.apache.org/jira/browse/SPARK-5218
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core, Web UI
Affects Versions: 1.3.0
Reporter: Prashant Sharma
Assignee: Prashant Sharma






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5217) Spark UI should report waiting stages during job execution.

2015-01-13 Thread Prashant Sharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Sharma updated SPARK-5217:
---
Attachment: waiting_stages.png

 Spark UI should report waiting stages during job execution.
 ---

 Key: SPARK-5217
 URL: https://issues.apache.org/jira/browse/SPARK-5217
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core, Web UI
Affects Versions: 1.3.0
Reporter: Prashant Sharma
Assignee: Prashant Sharma
 Attachments: waiting_stages.png


 This is a first step. 
 The Spark listener already reports all the stages at the time of job submission, 
 of which we currently only show the active, failed and completed ones. This 
 addition has no overhead and seems straightforward to achieve.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-01-13 Thread Shixiong Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275014#comment-14275014
 ] 

Shixiong Zhu commented on SPARK-5124:
-

For 1), I prefer to finish it before this JIRA. See 
[#4016|https://github.com/apache/spark/pull/4016].

For 2), I will write some prototype code to see if the current API design is 
sufficient.

 Standardize internal RPC interface
 --

 Key: SPARK-5124
 URL: https://issues.apache.org/jira/browse/SPARK-5124
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Reynold Xin
Assignee: Shixiong Zhu
 Attachments: Pluggable RPC - draft 1.pdf


 In Spark we use Akka as the RPC layer. It would be great if we can 
 standardize the internal RPC interface to facilitate testing. This will also 
 provide the foundation to try other RPC implementations in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5214) Add EventLoop and change DAGScheduler to an EventLoop

2015-01-13 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-5214:
---

 Summary: Add EventLoop and change DAGScheduler to an EventLoop
 Key: SPARK-5214
 URL: https://issues.apache.org/jira/browse/SPARK-5214
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Shixiong Zhu


As per the discussion in SPARK-5124, DAGScheduler can simply use a queue-based 
event loop to process events. It would also be helpful when we want to decouple 
from Akka in the future.
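
A minimal sketch of the queue-plus-thread idea (illustrative; not the EventLoop 
class from PR #4016):

{code}
import java.util.concurrent.LinkedBlockingQueue

// Events are posted to a queue and handled one at a time on a dedicated thread.
abstract class SimpleEventLoop[E](name: String) {
  private val queue = new LinkedBlockingQueue[E]()
  @volatile private var stopped = false

  private val thread = new Thread(name) {
    override def run(): Unit =
      try {
        while (!stopped) onReceive(queue.take())
      } catch {
        case _: InterruptedException => // interrupted by stop()
      }
  }

  def start(): Unit = thread.start()
  def stop(): Unit = { stopped = true; thread.interrupt() }
  def post(event: E): Unit = queue.put(event)

  protected def onReceive(event: E): Unit
}
{code}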



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5216) Spark Ui should report estimated time remaining for each stage.

2015-01-13 Thread Prashant Sharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Sharma updated SPARK-5216:
---
Description: 
Per stage feedback on estimated remaining time can help user get a grasp on how 
much time the job is going to take. This will only require changes on the 
UI/JobProgressListener side of code since we already have most of the 
information needed. 

In the initial cut, plan is to estimate time based on statistics of running job 
i.e. average time taken by each task and number of task per stage. This will 
makes sense when jobs are long. And then if this makes sense, then more 
heuristics can be added like projected time saved if the rdd is cached and so 
on. 

More precise details will come as this evolves. In the meantime thoughts on 
alternate ways and suggestion on usefulness are welcome.

 Spark Ui should report estimated time remaining for each stage.
 ---

 Key: SPARK-5216
 URL: https://issues.apache.org/jira/browse/SPARK-5216
 Project: Spark
  Issue Type: Wish
  Components: Spark Core, Web UI
Affects Versions: 1.3.0
Reporter: Prashant Sharma
Assignee: Prashant Sharma

 Per-stage feedback on estimated remaining time can help users get a grasp on 
 how much time the job is going to take. This only requires changes on the 
 UI/JobProgressListener side of the code, since we already have most of the 
 information needed. 
 In the initial cut, the plan is to estimate time based on statistics of the 
 running job, i.e. the average time taken by each task and the number of tasks 
 per stage. This makes sense when jobs are long. If that works out, more 
 heuristics can be added, like the projected time saved if the RDD is cached, 
 and so on. 
 More precise details will come as this evolves. In the meantime, thoughts on 
 alternative approaches and suggestions on usefulness are welcome.
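
A hedged sketch of the first-cut arithmetic described above (names are illustrative, 
not the JobProgressListener API):

{code}
// Extrapolate from the average time of the tasks that have already completed.
def estimatedRemainingMs(totalTasks: Int,
                         completedTasks: Int,
                         totalCompletedTaskTimeMs: Long): Option[Long] = {
  if (completedTasks <= 0) None // nothing to extrapolate from yet
  else {
    val avgTaskTimeMs = totalCompletedTaskTimeMs.toDouble / completedTasks
    Some((avgTaskTimeMs * (totalTasks - completedTasks)).toLong)
  }
}
{code}

This ignores parallelism across executors, so it is at best an upper-bound-style 
estimate; dividing by the number of concurrently running tasks would be a natural 
refinement.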



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5214) Add EventLoop and change DAGScheduler to an EventLoop

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275007#comment-14275007
 ] 

Apache Spark commented on SPARK-5214:
-

User 'zsxwing' has created a pull request for this issue:
https://github.com/apache/spark/pull/4016

 Add EventLoop and change DAGScheduler to an EventLoop
 -

 Key: SPARK-5214
 URL: https://issues.apache.org/jira/browse/SPARK-5214
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Shixiong Zhu

 As per the discussion in SPARK-5124, DAGScheduler can simply use a queue-based 
 event loop to process events. It would also be helpful when we want to decouple 
 from Akka in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5216) Spark Ui should report estimated time remaining for each stage.

2015-01-13 Thread Prashant Sharma (JIRA)
Prashant Sharma created SPARK-5216:
--

 Summary: Spark Ui should report estimated time remaining for each 
stage.
 Key: SPARK-5216
 URL: https://issues.apache.org/jira/browse/SPARK-5216
 Project: Spark
  Issue Type: Wish
  Components: Spark Core, Web UI
Affects Versions: 1.3.0
Reporter: Prashant Sharma
Assignee: Prashant Sharma






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5220) keepPushingBlocks in BlockGenerator terminated when an exception occurs, which causes the block pushing thread to terminate and blocks receiver

2015-01-13 Thread Max Xu (JIRA)
Max Xu created SPARK-5220:
-

 Summary: keepPushingBlocks in BlockGenerator terminated when an 
exception occurs, which causes the block pushing thread to terminate and blocks 
receiver  
 Key: SPARK-5220
 URL: https://issues.apache.org/jira/browse/SPARK-5220
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.0
Reporter: Max Xu


I am running a Spark Streaming application with ReliableKafkaReceiver. It uses 
BlockGenerator to push blocks to the BlockManager. However, writing WALs to HDFS 
may time out, which causes keepPushingBlocks in BlockGenerator to terminate.

15/01/12 19:07:06 ERROR receiver.BlockGenerator: Error in block pushing thread
java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at 
scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at 
scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at 
org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler.storeBlock(ReceivedBlockHandler.scala:176)
at 
org.apache.spark.streaming.receiver.ReceiverSupervisorImpl.pushAndReportBlock(ReceiverSupervisorImpl.scala:160)
at 
org.apache.spark.streaming.receiver.ReceiverSupervisorImpl.pushArrayBuffer(ReceiverSupervisorImpl.scala:126)
at 
org.apache.spark.streaming.receiver.Receiver.store(Receiver.scala:124)
at 
org.apache.spark.streaming.kafka.ReliableKafkaReceiver.org$apache$spark$streaming$kafka$ReliableKafkaReceiver$$storeBlockAndCommitOffset(ReliableKafkaReceiver.scala:207)
at 
org.apache.spark.streaming.kafka.ReliableKafkaReceiver$GeneratedBlockHandler.onPushBlock(ReliableKafkaReceiver.scala:275)
at 
org.apache.spark.streaming.receiver.BlockGenerator.pushBlock(BlockGenerator.scala:181)
at 
org.apache.spark.streaming.receiver.BlockGenerator.org$apache$spark$streaming$receiver$BlockGenerator$$keepPushingBlocks(BlockGenerator.scala:154)
at 
org.apache.spark.streaming.receiver.BlockGenerator$$anon$1.run(BlockGenerator.scala:86)

Then the block pushing thread is done and no subsequent blocks can be pushed 
into the BlockManager. In turn this blocks the receiver from receiving new data.

So when the TimeoutException happens while my app is running, the 
ReliableKafkaReceiver stays in ACTIVE status but doesn't do anything at all. 
The application effectively hangs.
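
A hedged sketch of one possible mitigation (illustrative only, not the actual 
BlockGenerator code, and it sidesteps the question of whether a failed block should 
be retried or reported back to the driver): catch per-block failures so a single 
timeout does not kill the pushing thread.

{code}
import scala.util.control.NonFatal

// Keep the pushing loop alive across individual block failures.
def keepPushingBlocks[T](blocks: Iterator[Seq[T]], push: Seq[T] => Unit): Unit = {
  blocks.foreach { block =>
    try {
      push(block)
    } catch {
      case NonFatal(e) =>
        System.err.println(s"Failed to push block, continuing: $e")
    }
  }
}
{code}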



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4794) Wrong parse of GROUP BY query

2015-01-13 Thread Damien Carol (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275464#comment-14275464
 ] 

Damien Carol commented on SPARK-4794:
-

[~marmbrus] Sorry for the late answer.

For the record, I've been testing this query on every commit (trunk branch on git) 
without success since I created this ticket.

Here are the details (EXPLAIN query):
{noformat}
explain [...]
{noformat}

{noformat}
== Physical Plan ==
Project [Annee#3676,Mois#3677,Jour#3678,Heure#3679,Societe#3680,Magasin#3681,CF 
Presentee#3682,CompteCarteFidelite#3683,NbCompteCarteFidelite#3684,DetentionCF#3685,NbCarteFidelite#3686,PlageDUCB#3687,NbCheque#3688L,CACheque#3689,NbImpaye#3690,NbEnsemble#3691L,NbCompte#3692,ResteDuImpaye#3693]
 !Sort [annee#3695 ASC,mois#3696 ASC,jour#3697 ASC,heure#3698 
ASC,nom_societe#3699 ASC,id_magasin#3700 ASC,CarteFidelitePresentee#3702 
ASC,CompteCarteFidelite#3705 ASC,NbCompteCarteFidelite#3706 
ASC,DetentionCF#3703 ASC,NbCarteFidelite#3704 ASC,Id_CF_Dim_DUCB#3707 ASC], true
  !Exchange (RangePartitioning [annee#3695 ASC,mois#3696 ASC,jour#3697 
ASC,heure#3698 ASC,nom_societe#3699 ASC,id_magasin#3700 
ASC,CarteFidelitePresentee#3702 ASC,CompteCarteFidelite#3705 
ASC,NbCompteCarteFidelite#3706 ASC,DetentionCF#3703 ASC,NbCarteFidelite#3704 
ASC,Id_CF_Dim_DUCB#3707 ASC], 200)
   !OutputFaker 
[Annee#3676,Mois#3677,Jour#3678,Heure#3679,Societe#3680,Magasin#3681,CF 
Presentee#3682,CompteCarteFidelite#3683,NbCompteCarteFidelite#3684,DetentionCF#3685,NbCarteFidelite#3686,PlageDUCB#3687,NbCheque#3688L,CACheque#3689,NbImpaye#3690,NbEnsemble#3691L,NbCompte#3692,ResteDuImpaye#3693,Mois#3677,Annee#3676,Jour#3678,id_magasin#3700,DetentionCF#3685,CompteCarteFidelite#3683,nom_societe#3699,NbCarteFidelite#3686,NbCompteCarteFidelite#3684,CarteFidelitePresentee#3702,Id_CF_Dim_DUCB#3707,Heure#3679]
Project [annee#3715 AS Annee#3676,mois#3716 AS Mois#3677,jour#3717 AS 
Jour#3678,heure#3718 AS Heure#3679,nom_societe#3719 AS 
Societe#3680,id_magasin#3720 AS Magasin#3681,CarteFidelitePresentee#3722 AS CF 
Presentee#3682,CompteCarteFidelite#3725 AS 
CompteCarteFidelite#3683,NbCompteCarteFidelite#3726 AS 
NbCompteCarteFidelite#3684,DetentionCF#3723 AS 
DetentionCF#3685,NbCarteFidelite#3724 AS 
NbCarteFidelite#3686,Id_CF_Dim_DUCB#3727 AS PlageDUCB#3687,NbCheque#3729L AS 
NbCheque#3688L,CACheque#3730 AS CACheque#3689,NbImpaye#3731 AS 
NbImpaye#3690,Id_Ensemble#3732L AS NbEnsemble#3691L,ZIBZIN#3734 AS 
NbCompte#3692,ResteDuImpaye#3733 AS 
ResteDuImpaye#3693,mois#3716,annee#3715,jour#3717,id_magasin#3720,DetentionCF#3723,CompteCarteFidelite#3725,nom_societe#3719,NbCarteFidelite#3724,NbCompteCarteFidelite#3726,CarteFidelitePresentee#3722,Id_CF_Dim_DUCB#3727,heure#3718]
 Filter ((((annee#3715 = 2014) && (mois#3716 = 1)) && (jour#3717 = 25)) && 
(id_magasin#3720 = 649))
  ParquetTableScan 
[Id_CF_Dim_DUCB#3727,ResteDuImpaye#3733,NbCarteFidelite#3724,heure#3718,mois#3716,CompteCarteFidelite#3725,annee#3715,CarteFidelitePresentee#3722,CACheque#3730,NbImpaye#3731,ZIBZIN#3734,NbCompteCarteFidelite#3726,DetentionCF#3723,id_magasin#3720,nom_societe#3719,Id_Ensemble#3732L,jour#3717,NbCheque#3729L],
 (ParquetRelation 
hdfs://nc-h07/user/hive/warehouse/testsimon3.db/cf_encaissement_fact_pq, 
Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, 
mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, 
hdfs-site.xml), org.apache.spark.sql.hive.HiveContext@7db3bcc, []), []
{noformat}

Complete stack trace :
{noformat}


15/01/13 17:10:32 INFO SparkExecuteStatementOperation: Running query ' CACHE 
TABLE A_1421165432909 AS select
`cf_encaissement_fact_pq`.`annee` as `Annee`,
`cf_encaissement_fact_pq`.`mois` as `Mois`,
`cf_encaissement_fact_pq`.`jour` as `Jour`,
`cf_encaissement_fact_pq`.`heure` as `Heure`,
`cf_encaissement_fact_pq`.`nom_societe` as `Societe`,
`cf_encaissement_fact_pq`.`id_magasin` as `Magasin`,
`cf_encaissement_fact_pq`.`CarteFidelitePresentee` as `CF Presentee`,
`cf_encaissement_fact_pq`.`CompteCarteFidelite` as `CompteCarteFidelite`,
`cf_encaissement_fact_pq`.`NbCompteCarteFidelite` as 
`NbCompteCarteFidelite`,
`cf_encaissement_fact_pq`.`DetentionCF` as `DetentionCF`,
`cf_encaissement_fact_pq`.`NbCarteFidelite` as `NbCarteFidelite`,
`cf_encaissement_fact_pq`.`Id_CF_Dim_DUCB` as `PlageDUCB`,
`cf_encaissement_fact_pq`.`NbCheque` as `NbCheque`,
`cf_encaissement_fact_pq`.`CACheque` as `CACheque`,
`cf_encaissement_fact_pq`.`NbImpaye` as `NbImpaye`,
`cf_encaissement_fact_pq`.`Id_Ensemble` as `NbEnsemble`,
`cf_encaissement_fact_pq`.`ZIBZIN` as `NbCompte`,
`cf_encaissement_fact_pq`.`ResteDuImpaye` as `ResteDuImpaye`
from
`testsimon3`.`cf_encaissement_fact_pq` as `cf_encaissement_fact_pq`
where
`cf_encaissement_fact_pq`.`annee` = 2014
and
`cf_encaissement_fact_pq`.`mois` = 1
and
`cf_encaissement_fact_pq`.`jour` = 25
and
 

[jira] [Created] (SPARK-5221) FileInputDStream remember window in certain situations causes files to be ignored

2015-01-13 Thread Jem Tucker (JIRA)
Jem Tucker created SPARK-5221:
-

 Summary: FileInputDStream remember window in certain situations 
causes files to be ignored 
 Key: SPARK-5221
 URL: https://issues.apache.org/jira/browse/SPARK-5221
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.0, 1.1.1
Reporter: Jem Tucker
Priority: Minor


When batch times are greater than 1 minute, a file may begin to be moved into a 
directory just before FileInputDStream.findNewFiles() is called but only become 
visible after it has executed, and therefore not be included in that batch. The 
file is then ignored in the following batch as well, because its mod time is 
less than the modTimeIgnoreThreshold. This causes Spark Streaming to ignore data 
that it shouldn't, especially when large files are being moved into the 
directory.
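
A hedged sketch of the selection window involved (simplified; the real 
FileInputDStream logic also tracks recently selected files):

{code}
// A file is only picked up if its modification time falls inside the current
// remember window; a file that appears just after one scan and whose mod time
// is just before the next scan's threshold is never selected.
def isSelected(modTime: Long, currentTime: Long, rememberWindowMs: Long): Boolean = {
  val modTimeIgnoreThreshold = currentTime - rememberWindowMs
  modTime > modTimeIgnoreThreshold && modTime <= currentTime
}
{code}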



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4796) Spark does not remove temp files

2015-01-13 Thread Fabian Gebert (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275494#comment-14275494
 ] 

Fabian Gebert commented on SPARK-4796:
--

suffering from this issue as well and can't see any workaround

 Spark does not remove temp files
 

 Key: SPARK-4796
 URL: https://issues.apache.org/jira/browse/SPARK-4796
 Project: Spark
  Issue Type: Bug
  Components: Input/Output
Affects Versions: 1.1.0
 Environment: I'm running Spark on Mesos and the Mesos slaves are Docker 
 containers. Spark 1.1.0, elasticsearch spark 2.1.0-Beta3, mesos 0.20.0, 
 docker 1.2.0.
Reporter: Ian Babrou

 I started a job that cannot fit into memory and got "no space left on 
 device". That was fair, because the Docker containers only have 10gb of disk 
 space and some is already taken by the OS.
 But then I found out that when the job failed it didn't release any disk space 
 and left the container without any free disk space.
 Then I decided to check whether Spark removes temp files at all, because many 
 Mesos slaves had /tmp/spark-local-*. Apparently some garbage stays after a 
 Spark task is finished. I attached strace to a running job:
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/12/temp_8a73fcc2-4baa-499a-8add-0161f918de8a)
  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/31/temp_47efd04b-d427-4139-8f48-3d5d421e9be4)
  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/15/temp_619a46dc-40de-43f1-a844-4db146a607c6)
  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/05/temp_d97d90a7-8bc1-4742-ba9b-41d74ea73c36
  unfinished ...
 [pid 30212] ... unlink resumed )  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/36/temp_a2deb806-714a-457a-90c8-5d9f3247a5d7)
  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/04/temp_afd558f1-2fd0-48d7-bc65-07b5f4455b22)
  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/32/temp_a7add910-8dc3-482c-baf5-09d5a187c62a
  unfinished ...
 [pid 30212] ... unlink resumed )  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/21/temp_485612f0-527f-47b0-bb8b-6016f3b9ec19)
  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/12/temp_bb2b4e06-a9dd-408e-8395-f6c5f4e2d52f)
  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/1e/temp_825293c6-9d3b-4451-9cb8-91e2abe5a19d
  unfinished ...
 [pid 30212] ... unlink resumed )  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/15/temp_43fbb94c-9163-4aa7-ab83-e7693b9f21fc)
  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/3d/temp_37f3629c-1b09-4907-b599-61b7df94b898
  unfinished ...
 [pid 30212] ... unlink resumed )  = 0
 [pid 30212] 
 unlink(/tmp/spark-local-20141209091330-48b5/35/temp_d18f49f6-1fb1-4c01-a694-0ee0a72294c0)
  = 0
 And after job is finished, some files are still there:
 /tmp/spark-local-20141209091330-48b5/
 /tmp/spark-local-20141209091330-48b5/11
 /tmp/spark-local-20141209091330-48b5/11/shuffle_0_1_4
 /tmp/spark-local-20141209091330-48b5/32
 /tmp/spark-local-20141209091330-48b5/04
 /tmp/spark-local-20141209091330-48b5/05
 /tmp/spark-local-20141209091330-48b5/0f
 /tmp/spark-local-20141209091330-48b5/0f/shuffle_0_1_2
 /tmp/spark-local-20141209091330-48b5/3d
 /tmp/spark-local-20141209091330-48b5/0e
 /tmp/spark-local-20141209091330-48b5/0e/shuffle_0_1_1
 /tmp/spark-local-20141209091330-48b5/15
 /tmp/spark-local-20141209091330-48b5/0d
 /tmp/spark-local-20141209091330-48b5/0d/shuffle_0_1_0
 /tmp/spark-local-20141209091330-48b5/36
 /tmp/spark-local-20141209091330-48b5/31
 /tmp/spark-local-20141209091330-48b5/12
 /tmp/spark-local-20141209091330-48b5/21
 /tmp/spark-local-20141209091330-48b5/10
 /tmp/spark-local-20141209091330-48b5/10/shuffle_0_1_3
 /tmp/spark-local-20141209091330-48b5/1e
 /tmp/spark-local-20141209091330-48b5/35
 If I look into my mesos slaves, there are mostly shuffle files, overall 
 picture for single node:
 root@web338:~# find /tmp/spark-local-20141* -type f | fgrep shuffle | wc -l
 781
 root@web338:~# find /tmp/spark-local-20141* -type f | fgrep -v shuffle | wc -l
 10
 root@web338:~# find /tmp/spark-local-20141* -type f | fgrep -v shuffle
 /tmp/spark-local-20141119144512-67c4/2d/temp_9056f380-3edb-48d6-a7df-d4896f1e1cc3
 /tmp/spark-local-20141119144512-67c4/3d/temp_e005659b-eddf-4a34-947f-4f63fcddf111
 /tmp/spark-local-20141119144512-67c4/16/temp_71eba702-36b4-4e1a-aebc-20d2080f1705
 /tmp/spark-local-20141119144512-67c4/0d/temp_8037b9db-2d8a-4786-a554-a8cad922bf5e
 /tmp/spark-local-20141119144512-67c4/24/temp_f0e4cc43-6cc9-42a7-882d-f8a031fa4dc3
 /tmp/spark-local-20141119144512-67c4/29/temp_a8bbe2cb-f590-4b71-8ef8-9c0324beddc7
 /tmp/spark-local-20141119144512-67c4/3a/temp_9fc08519-f23a-40ac-a3fd-e58df6871460
 

[jira] [Comment Edited] (SPARK-4879) Missing output partitions after job completes with speculative execution

2015-01-13 Thread Zach Fry (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14272291#comment-14272291
 ] 

Zach Fry edited comment on SPARK-4879 at 1/13/15 7:51 PM:
--

Hey Josh,

I was able to reproduce the missing file using the speculation settings in my 
previous comment:

{code}
scala 15/01/09 18:33:28 WARN scheduler.TaskSetManager: Lost task 42.1 in stage 
0.0 (TID 113, redacted-03): java.io.IOException: Failed to save output of 
task: attempt_201501091833__m_42_113

org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:160)

org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)

org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
org.apache.spark.SparkHadoopWriter.commit(SparkHadoopWriter.scala:109)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:991)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)

java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
15/01/09 18:33:47 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 
(TID 0, redacted-03): java.io.IOException: The temporary job-output directory 
hdfs://redacted-01:8020/test2/_temporary doesn't exist!

org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)

org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:240)

org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:89)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:980)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)

java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
{code}

Notice here that there are only 99 part files and that part-00042 is missing (as 
seen in the stack trace above).
{code}
 $ hadoop fs -ls /test2 | grep part | wc -l
99
palantir@pd-support-01 (/home/palantir/homes/zfry) (master)
 $ hadoop fs -ls /test2 | grep part-0004
-rw-r--r--   3 palantir supergroup  8 2015-01-09 18:33 /test2/part-00040
-rw-r--r--   3 palantir supergroup  8 2015-01-09 18:33 /test2/part-00041
-rw-r--r--   3 palantir supergroup  8 2015-01-09 18:33 /test2/part-00043
-rw-r--r--   3 palantir supergroup  8 2015-01-09 18:33 /test2/part-00044
-rw-r--r--   3 palantir supergroup  8 2015-01-09 18:33 /test2/part-00045
-rw-r--r--   3 palantir supergroup  8 2015-01-09 18:33 /test2/part-00046
-rw-r--r--   3 palantir supergroup  8 2015-01-09 18:33 /test2/part-00047
-rw-r--r--   3 palantir supergroup  8 2015-01-09 18:33 /test2/part-00048
-rw-r--r--   3 palantir supergroup  8 2015-01-09 18:33 /test2/part-00049
{code}
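
A quick way to confirm this kind of gap is to scan the output directory for missing part-NNNNN indices. Below is a minimal Scala sketch using the Hadoop FileSystem API; the output path and the expected partition count are assumptions taken from the run above and would need to be adjusted.

{code}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object MissingPartCheck {
  def main(args: Array[String]): Unit = {
    // Illustrative assumptions: adjust the output path and the number of
    // partitions the job was expected to write.
    val outputDir = new Path("hdfs://redacted-01:8020/test2")
    val expectedParts = 100

    val fs = FileSystem.get(outputDir.toUri, new Configuration())

    // Collect the numeric suffixes of all part-NNNNN files that are present.
    val present = fs.listStatus(outputDir)
      .map(_.getPath.getName)
      .filter(_.startsWith("part-"))
      .map(_.stripPrefix("part-").toInt)
      .toSet

    // Report any expected partition index that has no corresponding file.
    val missing = (0 until expectedParts).filterNot(present.contains)
    println(s"missing partitions: ${missing.mkString(", ")}")
  }
}
{code}

Run against the listing above, this would report partition 42 as missing.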




was (Author: zfry):
Hey Josh,

I was able to reproduce the missing file using the speculation settings in my 
previous comment:

{code}
scala 15/01/09 18:33:28 WARN scheduler.TaskSetManager: Lost task 42.1 in stage 
0.0 (TID 113, redacted-03): java.io.IOException: Failed to save output of 
task: attempt_201501091833__m_42_113

org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:160)

org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)

org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
org.apache.spark.SparkHadoopWriter.commit(SparkHadoopWriter.scala:109)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:991)

org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)


[jira] [Commented] (SPARK-5095) Support launching multiple mesos executors in coarse grained mesos mode

2015-01-13 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276094#comment-14276094
 ] 

Timothy Chen commented on SPARK-5095:
-

[~joshdevins][~maasg]
I have a PR out now, I wonder if you guys can try it? 
https://github.com/apache/spark/pull/4027

 Support launching multiple mesos executors in coarse grained mesos mode
 ---

 Key: SPARK-5095
 URL: https://issues.apache.org/jira/browse/SPARK-5095
 Project: Spark
  Issue Type: Improvement
  Components: Mesos
Reporter: Timothy Chen

 Currently in coarse-grained Mesos mode, it's expected that we only launch one 
 Mesos executor, which launches one JVM process to run multiple Spark 
 executors.
 However, this becomes a problem when the launched JVM process is larger than 
 an ideal size (30 GB is the recommended value from Databricks), which causes the GC 
 problems reported on the mailing list.
 We should support launching multiple executors when large enough resources 
 are available for Spark to use and these resources are still under the 
 configured limit.
 This is also applicable when users want to specify the number of executors to be 
 launched on each node.
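
For reference, this is roughly how a coarse-grained Mesos application is configured today; the master URL, memory, and core values below are illustrative assumptions, not values taken from this issue. Until multiple executors per node are supported, all cores obtained on a node go to a single JVM whose heap is spark.executor.memory.

{code}
import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch of today's coarse-grained Mesos setup (values are made up).
val conf = new SparkConf()
  .setMaster("mesos://mesos-master:5050")   // assumed Mesos master URL
  .setAppName("coarse-grained-example")
  .set("spark.mesos.coarse", "true")        // one long-running executor JVM per node
  .set("spark.executor.memory", "28g")      // keep the single JVM heap below ~30 GB
  .set("spark.cores.max", "64")             // cap the total cores acquired cluster-wide

val sc = new SparkContext(conf)
{code}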



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5095) Support launching multiple mesos executors in coarse grained mesos mode

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276093#comment-14276093
 ] 

Apache Spark commented on SPARK-5095:
-

User 'tnachen' has created a pull request for this issue:
https://github.com/apache/spark/pull/4027

 Support launching multiple mesos executors in coarse grained mesos mode
 ---

 Key: SPARK-5095
 URL: https://issues.apache.org/jira/browse/SPARK-5095
 Project: Spark
  Issue Type: Improvement
  Components: Mesos
Reporter: Timothy Chen

 Currently in coarse-grained Mesos mode, it's expected that we only launch one 
 Mesos executor, which launches one JVM process to run multiple Spark 
 executors.
 However, this becomes a problem when the launched JVM process is larger than 
 an ideal size (30 GB is the recommended value from Databricks), which causes the GC 
 problems reported on the mailing list.
 We should support launching multiple executors when large enough resources 
 are available for Spark to use and these resources are still under the 
 configured limit.
 This is also applicable when users want to specify the number of executors to be 
 launched on each node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5228) Hide tables for Active Jobs/Completed Jobs/Failed Jobs when they are empty

2015-01-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276109#comment-14276109
 ] 

Apache Spark commented on SPARK-5228:
-

User 'sarutak' has created a pull request for this issue:
https://github.com/apache/spark/pull/4028

 Hide tables for Active Jobs/Completed Jobs/Failed Jobs when they are empty
 

 Key: SPARK-5228
 URL: https://issues.apache.org/jira/browse/SPARK-5228
 Project: Spark
  Issue Type: Improvement
  Components: Web UI
Affects Versions: 1.3.0
Reporter: Kousuke Saruta

 In the current Web UI, the tables for Active Stages, Completed Stages, Skipped Stages 
 and Failed Stages are hidden when they are empty, while the tables for Active 
 Jobs, Completed Jobs and Failed Jobs are shown even when they are empty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5223) Use pickle instead of MapConvert and ListConvert in MLlib Python API

2015-01-13 Thread Davies Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davies Liu updated SPARK-5223:
--
Description: 
It will introduce problems if an object inside a dict/list/tuple cannot be handled by 
py4j, such as a Vector.

Also, pickle may have better performance for larger objects (fewer RPCs).

In cases where an object in a dict/list cannot be pickled (such as a 
JavaObject), we should still use MapConvert/ListConvert.

discussion: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Python-to-Java-object-conversion-of-numpy-array-td10065.html

  was:
It will introduce problems if an object inside a dict/list/tuple cannot be handled by 
py4j, such as a Vector.

Also, pickle may have better performance for larger objects (fewer RPCs).

In cases where an object in a dict/list cannot be pickled (such as a 
JavaObject), we should still use MapConvert/ListConvert.


 Use pickle instead of MapConvert and ListConvert in MLlib Python API
 

 Key: SPARK-5223
 URL: https://issues.apache.org/jira/browse/SPARK-5223
 Project: Spark
  Issue Type: Bug
  Components: MLlib, PySpark
Reporter: Davies Liu
Priority: Critical

 It will introduce problems if an object inside a dict/list/tuple cannot be handled 
 by py4j, such as a Vector.
 Also, pickle may have better performance for larger objects (fewer RPCs).
 In cases where an object in a dict/list cannot be pickled (such as a 
 JavaObject), we should still use MapConvert/ListConvert.
 discussion: 
 http://apache-spark-developers-list.1001551.n3.nabble.com/Python-to-Java-object-conversion-of-numpy-array-td10065.html
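
To make the "fewer RPCs" point concrete, here is a rough JVM-side sketch using Pyrolite (net.razorvine.pickle), the library Spark's Python serde builds on: a whole collection crosses as one pickled byte array and is materialized in a single call, instead of being converted element by element through the py4j gateway. The payload below is purely illustrative.

{code}
import net.razorvine.pickle.{Pickler, Unpickler}

// Build one pickled payload for a whole collection (illustrative data only).
val pickler = new Pickler()
val payload: Array[Byte] = pickler.dumps(java.util.Arrays.asList(1.0, 2.0, 3.0))

// One call restores the entire structure on the JVM; no per-element round trips.
val restored = new Unpickler().loads(payload)
println(restored)
{code}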



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5097) Adding data frame APIs to SchemaRDD

2015-01-13 Thread Mohit Jaggi (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275960#comment-14275960
 ] 

Mohit Jaggi commented on SPARK-5097:


minor comment: mutate existing can do 
df(x) = df(x) 

 Adding data frame APIs to SchemaRDD
 ---

 Key: SPARK-5097
 URL: https://issues.apache.org/jira/browse/SPARK-5097
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: Reynold Xin
Priority: Critical
 Attachments: DesignDocAddingDataFrameAPIstoSchemaRDD.pdf


 SchemaRDD, through its DSL, already provides common data frame 
 functionality. However, the DSL was originally created for constructing 
 test cases, without much consideration for end-user usability and API stability. 
 This design doc proposes a set of API changes for Scala and Python to make 
 the SchemaRDD DSL API more usable and stable.
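
For readers who have not used it, this is roughly how the existing SchemaRDD DSL reads. It is a sketch of the 1.2-era experimental API only: the case class, data, and column names are made up, and it assumes the SQLContext implicits are imported so the Symbol-based expressions resolve.

{code}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

case class Person(name: String, age: Int)

val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("dsl-sketch"))
val sqlContext = new SQLContext(sc)
import sqlContext._   // implicit RDD-to-SchemaRDD conversion and Symbol DSL helpers

val people = sc.parallelize(Seq(Person("alice", 15), Person("bob", 30)))

// Language-integrated query: equivalent to
// SELECT name FROM people WHERE age >= 13 AND age <= 19
val teenagers = people.where('age >= 13).where('age <= 19).select('name)
teenagers.collect().foreach(println)
{code}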



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5226) Add DBSCAN Clustering Algorithm to MLlib

2015-01-13 Thread Muhammad-Ali A'rabi (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275981#comment-14275981
 ] 

Muhammad-Ali A'rabi commented on SPARK-5226:


Although I can't assign this task to myself, I am interested in working on it.

 Add DBSCAN Clustering Algorithm to MLlib
 

 Key: SPARK-5226
 URL: https://issues.apache.org/jira/browse/SPARK-5226
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Affects Versions: 1.2.0
Reporter: Muhammad-Ali A'rabi
Priority: Minor
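
Since the issue body is empty, a plain single-machine sketch of what DBSCAN computes (density-based expansion of eps-neighbourhoods) is included below as a reference point. It is not an MLlib-ready implementation, which would need to be distributed and spatially indexed; all names and parameters are illustrative.

{code}
object DbscanSketch {
  type Point = Array[Double]

  // Euclidean distance between two points.
  def dist(a: Point, b: Point): Double =
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)

  /** Returns one cluster id per point; -1 marks noise. */
  def dbscan(points: IndexedSeq[Point], eps: Double, minPts: Int): Array[Int] = {
    val Unvisited = -2
    val Noise = -1
    val labels = Array.fill(points.length)(Unvisited)
    var cluster = 0

    // Brute-force eps-neighbourhood query (a real version would use an index).
    def neighbours(i: Int): Seq[Int] =
      points.indices.filter(j => dist(points(i), points(j)) <= eps)

    for (i <- points.indices if labels(i) == Unvisited) {
      val seeds = neighbours(i)
      if (seeds.size < minPts) {
        labels(i) = Noise              // not a core point, provisionally noise
      } else {
        labels(i) = cluster            // start a new cluster and grow it
        val frontier = scala.collection.mutable.Queue(seeds: _*)
        while (frontier.nonEmpty) {
          val j = frontier.dequeue()
          if (labels(j) == Noise) labels(j) = cluster   // reclaim border point
          if (labels(j) == Unvisited) {
            labels(j) = cluster
            val jn = neighbours(j)
            if (jn.size >= minPts) frontier ++= jn      // j is also a core point
          }
        }
        cluster += 1
      }
    }
    labels
  }
}
{code}

On a toy dataset, DbscanSketch.dbscan(points, eps = 0.5, minPts = 4) returns one label per point, with -1 for points that end up as noise.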





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5179) Spark UI history job duration is wrong

2015-01-13 Thread Olivier Toupin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olivier Toupin updated SPARK-5179:
--
Target Version/s: 1.2.1

 Spark UI history job duration is wrong
 --

 Key: SPARK-5179
 URL: https://issues.apache.org/jira/browse/SPARK-5179
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 1.2.0
Reporter: Olivier Toupin
Priority: Minor

 In the Web UI, job durations are wrong when reviewing a job through the 
 history. Stage durations are correct.
 Jobs are shown with durations of a few milliseconds, which is wrong. However, it's 
 only a history issue; while the job is running, the durations are correct.
 More details in that discussion on the mailing list:
 http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-UI-history-job-duration-is-wrong-tc10010.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-5223) Use pickle instead of MapConvert and ListConvert in MLlib Python API

2015-01-13 Thread Xiangrui Meng (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiangrui Meng resolved SPARK-5223.
--
   Resolution: Fixed
Fix Version/s: 1.2.1
   1.3.0

Issue resolved by pull request 4023
[https://github.com/apache/spark/pull/4023]

 Use pickle instead of MapConvert and ListConvert in MLlib Python API
 

 Key: SPARK-5223
 URL: https://issues.apache.org/jira/browse/SPARK-5223
 Project: Spark
  Issue Type: Bug
  Components: MLlib, PySpark
Reporter: Davies Liu
Priority: Critical
 Fix For: 1.3.0, 1.2.1


 It will introduce problems if an object inside a dict/list/tuple cannot be handled 
 by py4j, such as a Vector.
 Also, pickle may have better performance for larger objects (fewer RPCs).
 In cases where an object in a dict/list cannot be pickled (such as a 
 JavaObject), we should still use MapConvert/ListConvert.
 discussion: 
 http://apache-spark-developers-list.1001551.n3.nabble.com/Python-to-Java-object-conversion-of-numpy-array-td10065.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5223) Use pickle instead of MapConvert and ListConvert in MLlib Python API

2015-01-13 Thread Xiangrui Meng (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiangrui Meng updated SPARK-5223:
-
Assignee: Davies Liu

 Use pickle instead of MapConvert and ListConvert in MLlib Python API
 

 Key: SPARK-5223
 URL: https://issues.apache.org/jira/browse/SPARK-5223
 Project: Spark
  Issue Type: Bug
  Components: MLlib, PySpark
Reporter: Davies Liu
Assignee: Davies Liu
Priority: Critical
 Fix For: 1.3.0, 1.2.1


 It will introduce problems if an object inside a dict/list/tuple cannot be handled 
 by py4j, such as a Vector.
 Also, pickle may have better performance for larger objects (fewer RPCs).
 In cases where an object in a dict/list cannot be pickled (such as a 
 JavaObject), we should still use MapConvert/ListConvert.
 discussion: 
 http://apache-spark-developers-list.1001551.n3.nabble.com/Python-to-Java-object-conversion-of-numpy-array-td10065.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-4912) Persistent data source tables

2015-01-13 Thread Michael Armbrust (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Armbrust resolved SPARK-4912.
-
   Resolution: Fixed
Fix Version/s: 1.3.0

Issue resolved by pull request 3960
[https://github.com/apache/spark/pull/3960]

 Persistent data source tables
 -

 Key: SPARK-4912
 URL: https://issues.apache.org/jira/browse/SPARK-4912
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Michael Armbrust
Assignee: Michael Armbrust
Priority: Blocker
 Fix For: 1.3.0


 It would be good if tables created through the new data sources api could be 
 persisted to the hive metastore.
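
For context, the data sources DDL that already exists in 1.2 is sketched below; the table name, provider, and path are illustrative assumptions. The gist of this issue is that the non-TEMPORARY variant should be recorded in the Hive metastore so the table definition survives across sessions.

{code}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("ds-ddl-sketch"))
val hiveContext = new HiveContext(sc)

// Works today, but the table definition lives only in the current session.
hiveContext.sql("""
  CREATE TEMPORARY TABLE events
  USING org.apache.spark.sql.json
  OPTIONS (path '/data/events.json')
""")

hiveContext.sql("SELECT count(*) FROM events").collect().foreach(println)
{code}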



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


