Regards,
Raghava.
On Wed, Jul 20, 2016 at 2:08 AM, Saurav Sinha <sauravsinh...@gmail.com>
wrote:
> Hi,
>
> I have set the driver memory to 10 GB and the job ran with an intermediate
> failure, which Spark recovered from.
>
> But I still want to know if no of parts incre
Thank you. Sure, if I find something I will post it.
Regards,
Raghava.
On Wed, Jun 22, 2016 at 7:43 PM, Nirav Patel <npa...@xactlycorp.com> wrote:
> I believe it would be tasks, partitions, task status etc. information. I do
> not know the exact details of those, but I had OOM on drive
available limit. So the other options are
1) Separate the driver from master, i.e., run them on two separate nodes
2) Increase the RAM capacity on the driver/master node.
Regards,
Raghava.
On Wed, Jun 22, 2016 at 7:05 PM, Nirav Patel <npa...@xactlycorp.com> wrote:
> Yes driver keeps fa
them to T, i.e., T = T + deltaT
3) Stop when the current T size (count) is the same as the previous T size,
i.e., deltaT is 0.
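The loop described above can be sketched in plain Scala (collection-based, for illustration only; the names `fixpoint` and `deriveNew` are hypothetical stand-ins for the application's rule-application step):

```scala
// Fixpoint iteration: add newly derived elements to T until its size
// stops changing (i.e., deltaT is empty).
def fixpoint[A](initial: Set[A], deriveNew: Set[A] => Set[A]): Set[A] = {
  var t = initial
  var prevSize = -1
  while (t.size != prevSize) {
    prevSize = t.size
    t = t ++ deriveNew(t)   // T = T + deltaT
  }
  t
}
```

With RDDs the same shape applies, but each iteration's union would be followed by distinct(), and the termination check would compare count() across iterations.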
Do you think something happens on the driver, due to the application logic,
when the number of partitions is increased?
Regards,
Raghava.
On Wed, Jun 22, 2016 at 12:33 PM, Sonal Goyal
uld be the possible reasons behind the driver-side OOM when the
number of partitions is increased?
Regards,
Raghava.
(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
On Fri, May 13, 2016 at 6:33 AM, Raghava Mutharaju <
m.vijayaragh...@gmail.com> wrote:
> Thank you for the response.
>
> I use
= "org.apache.spark" % "spark-sql_2.11" % "2.0.0-SNAPSHOT"
lazy val root = (project in file(".")).
  settings(
    name := "sparkel",
    version := "0.1.0",
    scalaVersion := "2.11.8",
    libraryDependencies += spark,
    library
of spark version gives sbt error
unresolved dependency: org.apache.spark#spark-core_2.11;2.0.0-SNAPSHOT
I guess this is because the repository doesn't contain 2.0.0-SNAPSHOT. Does
this mean the only option is to put all the required jars in the lib
folder (unmanaged dependencies)?
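For reference, one common alternative to unmanaged jars (assuming the goal is to resolve SNAPSHOT builds from a repository) is adding a snapshots resolver to build.sbt; the Apache snapshots URL below is the usual one, but verify it actually hosts the version you need:

```scala
// build.sbt: let sbt resolve SNAPSHOT artifacts from the Apache snapshots repo
resolvers += "Apache Snapshots" at "https://repository.apache.org/snapshots/"
```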
Regards,
Raghava.
t that both RDDs are already hash
partitioned.
Regards,
Raghava.
On Tue, May 10, 2016 at 11:44 AM, Rishi Mishra <rmis...@snappydata.io>
wrote:
> As you have the same partitioner and number of partitions, you can probably
> use zipPartitions and provide a user-defined function to subtract.
>
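A minimal sketch of that suggestion, with the per-partition subtraction pulled out as a plain function so it can be exercised without a SparkContext (rdd1/rdd2 are hypothetical co-partitioned RDDs):

```scala
// Per-partition set difference: drop from the left iterator anything present
// in the right one. This is a correct subtract only when both RDDs share the
// same partitioner, so equal elements land in the same partition index.
def subtractPartition[A](left: Iterator[A], right: Iterator[A]): Iterator[A] = {
  val toRemove = right.toSet
  left.filterNot(toRemove.contains)
}

// Usage on co-partitioned RDDs (assumed names):
// val diff = rdd1.zipPartitions(rdd2)(subtractPartition[(Int, Int)] _)
```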
)))
(3,(16,Some(30)))
(3,(16,Some(16)))
case (x, (y, z)) => Apart from allowing z == None and filtering on y == z,
we also should filter out (3, (16, Some(30))). How can we do that
efficiently without resorting to broadcast of any elements of rdd2?
Regards,
Raghava.
On Mon, May 9, 2016 at 6
.
Regards,
Raghava.
use Spark 1.6.0
We noticed the following:
1) persisting an RDD seems to lead to unbalanced distribution of partitions
across the executors.
2) If one RDD has an all-or-nothing skew, then the rest of the RDDs that depend
on it also get an all-or-nothing skew.
Regards,
Raghava.
On Wed, Apr 27, 2016 at 10:20 AM
s even (happens when count is moved).
Any pointers in figuring out this issue are much appreciated.
Regards,
Raghava.
On Fri, Apr 22, 2016 at 7:40 PM, Mike Hynes <91m...@gmail.com> wrote:
> Glad to hear that the problem was solvable! I have not seen delays of this
> type for la
Thank you. For now we plan to use spark-shell to submit jobs.
Regards,
Raghava.
On Fri, Apr 22, 2016 at 7:40 PM, Mike Hynes <91m...@gmail.com> wrote:
> Glad to hear that the problem was solvable! I have not seen delays of this
> type for later stages in jobs run by spark-subm
stage also.
Apart from introducing a dummy stage or running it from spark-shell, is
there any other option to fix this?
Regards,
Raghava.
On Mon, Apr 18, 2016 at 12:17 AM, Mike Hynes <91m...@gmail.com> wrote:
> When submitting a job with spark-submit, I've observed delays (up to
>
No. We specify it as a configuration option to spark-submit. Does that
make a difference?
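For reference, passing the master on the command line looks like the following (the host, port, class, and jar name are placeholders):

```shell
# Master supplied to spark-submit; the program itself leaves setMaster() unset
spark-submit \
  --master spark://master-host:7077 \
  --class com.example.Main \
  app.jar
```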
Regards,
Raghava.
On Mon, Apr 18, 2016 at 9:56 AM, Sonal Goyal <sonalgoy...@gmail.com> wrote:
> Are you specifying your spark master in the scala program?
>
> Best Regards,
> Son
be that all the data is on one node and nothing on
the other, and no, the keys are not the same. They vary from 1 to around
55000 (integers). What makes this strange is that it seems to work fine in
the Spark shell (REPL).
Regards,
Raghava.
On Mon, Apr 18, 2016 at 1:14 AM, Mike Hynes <91m...@gmail.
size (which is more than adequate now). This behavior differs between
spark-shell and a Spark Scala program.
We are not using YARN; it's the standalone version of Spark.
Regards,
Raghava.
On Mon, Apr 18, 2016 at 12:09 AM, Anuj Kumar <anujs...@gmail.com> wrote:
> Few params like- spark.
tainedJobs and retainedStages have been increased to check them
in the UI.
What information regarding Spark Context would be of interest here?
Regards,
Raghava.
On Sun, Apr 17, 2016 at 10:54 PM, Anuj Kumar <anujs...@gmail.com> wrote:
> If the data file is same then it should have s
s, but this
behavior does not change. This seems strange.
Is there some problem with the way we use HashPartitioner?
Thanks in advance.
Regards,
Raghava.
la?
Does this point to some other issue?
In some other posts, I noticed use of kryo.register(). In this case, how do
we pass the kryo object to SparkContext?
Thanks in advance.
Regards,
Raghava.
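Regarding the kryo.register() pattern: Spark hands you the Kryo instance through a KryoRegistrator set in configuration, rather than you passing a Kryo object to SparkContext. A sketch (class name is a placeholder):

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

// Spark instantiates this class itself and passes its Kryo object in,
// so a Kryo instance is never handed to SparkContext directly.
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[org.apache.spark.sql.types.StructField])
  }
}

// In the SparkConf:
// conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
//     .set("spark.kryo.registrator", "MyRegistrator")
```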
(org.apache.spark.sql.types.StructField[].class);
I tried registering
using conf.registerKryoClasses(Array(classOf[StructField[]]))
But StructField[] does not exist. Is there any other way to register it? I
already registered StructField.
Regards,
Raghava.
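In Scala, the Java array type StructField[] is written Array[StructField], so (if the array class is indeed the missing registration) the call would be:

```scala
// Register the array-of-StructField class with Kryo
conf.registerKryoClasses(Array(classOf[Array[org.apache.spark.sql.types.StructField]]))
```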
Thanks a lot Ted.
If the two columns are of different types, say Int and Long, then will it be
ds.select(expr("_2 / _1").as[(Int, Long)])?
Regards,
Raghava.
On Wed, Feb 10, 2016 at 5:19 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> bq. I followed something similar $"
"x") == B.toDF().col("y"))
Is there a way to avoid using toDF()?
I am having similar issues with the usage of filter(A.x == B.y)
--
Regards,
Raghava
Ted,
Thank you for the pointer. That works, but what does a string prefixed
with a $ sign mean? Is it an expression?
Could you also help me with the select() parameter syntax? I followed
something similar, $"a.x", and it gives an error message that a TypedColumn
is expected.
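The $"..." form is a Scala string interpolator. A minimal self-contained illustration of the mechanism (these are not Spark's classes; Spark's spark.implicits._ provides the real one, which builds a ColumnName):

```scala
final case class Col(name: String)

object ColSyntax {
  // $"x" desugars to StringContext("x").$(), so adding a `$` method to
  // StringContext via an implicit class turns the literal into a Col.
  implicit class ColInterpolator(val sc: StringContext) {
    def $(args: Any*): Col = Col(sc.s(args: _*))
  }
}
```

In Spark the analogous result of $"a.x" is a ColumnName (a plain Column), which is why a typed select complains until .as[...] converts it to a TypedColumn.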
Regard
/stages?
Thanks in advance.
Raghava.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/DAG-visualization-no-visualization-information-available-with-history-server-tp26117.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
Hello All,
I am new to Spark and I am trying to understand how iterative application of
operations is handled in Spark. Consider the following program in Scala.
var u = sc.textFile(args(0) + "s1.txt").map(line =>
  line.split("\\|") match { case Array(x, y) => (y.toInt, x.toInt) })