-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
Hi,
I also filed a jira yesterday:
https://issues.apache.org/jira/browse/SPARK-26538
Looks like one needs to be closed as duplicate. Sorry for the late update.
Best regards
to get array elements of type decimal(38,18) and no error when
reading in this case.
Should this be considered a bug? Is there a workaround other than changing the
column array type definition to include explicit precision and scale?
Best regards,
Alexey
Hello!
I would like to avoid data checkpointing when processing a DStream. Basically,
we do not care if the intermediate data are lost.
Is there a way to achieve that? Is there an extension point or a class
encapsulating all the associated activities?
Thanks!
Sincerely yours,
—
Alexey Kharlamov
Hi Yanbo,
Thanks for your reply. I will keep an eye on that pull request.
For now, I decided to just put my code inside org.apache.spark.ml to be
able to access private classes.
Thanks,
Alexey
On Tue, Aug 16, 2016 at 11:13 PM, Yanbo Liang <yblia...@gmail.com> wrote:
> It seems that
From my personal experience: we read the metadata of the features
column in the dataframe to extract a mapping from the feature indices to the
original feature names, and use this mapping to translate the model
coefficients into a JSON string that maps the original feature names to
their
What's the reason for your first cache call? It looks like you've used the
data only once to transform it without reusing the data, so there's no
reason for the first cache call, and you need only the second call (and
that also depends on the rest of your code).
On Thu, Jun 16, 2016 at 3:17 PM,
when view
expires the length of the sliding window….
So my question: does anybody know of, or can share, a piece of code or the
know-how for implementing a “sliding Top N window” in a better way?
If nothing is offered, I will share what I come up with myself.
Thank you
Alexey
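Since the message above asks for a “sliding Top N window”, here is a minimal, Spark-free sketch of one way to do the core bookkeeping (all names here are assumptions, not the poster's code): keep per-key counts for the items currently inside the window, apply deltas as items enter and leave, and take the N largest on demand.

```scala
// Per-key counts for items currently inside the sliding window.
// updateCounts applies a batch of +/- deltas (items entering/leaving);
// topN extracts the n keys with the highest counts.
def updateCounts[K](counts: Map[K, Long], deltas: Seq[(K, Long)]): Map[K, Long] =
  deltas.foldLeft(counts) { case (acc, (k, d)) =>
    val v = acc.getOrElse(k, 0L) + d
    if (v <= 0L) acc - k else acc.updated(k, v)
  }

def topN[K](counts: Map[K, Long], n: Int): Seq[(K, Long)] =
  counts.toSeq.sortBy(-_._2).take(n)

val w1 = updateCounts(Map.empty[String, Long], Seq("a" -> 3L, "b" -> 5L, "c" -> 1L))
val w2 = updateCounts(w1, Seq("b" -> -4L, "c" -> 3L)) // the window slides
println(topN(w2, 2)) // the two most frequent keys in the current window
```

In a DStream setting the same idea would sit inside a windowed reduce; the snippet only illustrates the counting, not the Spark wiring.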
This message, including any attachments
Hi
I have a simple Spark Streaming job (8 executors, 1 core each, on an 8-node
cluster) that reads from a Kafka topic (3 brokers with 8 partitions) and saves
to Cassandra.
The problem is that when I increase the number of incoming messages in the
topic, the job starts to fail with
Koeninger" <c...@koeninger.org>:
> Can you provide more info (what version of spark, code example)?
>
> On Tue, Sep 8, 2015 at 8:18 AM, Alexey Ponkin <alexey.pon...@ya.ru> wrote:
>> Hi,
>>
>> I have an application with 2 streams, which are joined together.
Hi,
I have an application with 2 streams, which are joined together.
Stream1 is a simple DStream (with relatively small batch chunks).
Stream2 is a windowed DStream (with a duration of, for example, 60 seconds).
Stream1 and Stream2 are Kafka direct stream.
The problem is that according to logs window
Hi,
I have the following code
object MyJob extends org.apache.spark.Logging {
  ...
  val source: DStream[SomeType] = ...
  source.foreachRDD { rdd =>
    logInfo(s"""+++ForEachRDD+++""")
    rdd.foreachPartition { partitionOfRecords =>
      logInfo(s"""+++ForEachPartition+++""")
    }
  }
}
I'm
writing a Spark Streaming application to ingest from Kafka with the
Receiver API and want to create one DStream per physical machine for read
parallelism’s sake. How can I figure out at run time how many machines
there are, so I know how many DStreams to create?
--
Best regards, Alexey
what is going on?
Thanks,
--
Alexey Grishchenko, http://0x0fff.com
. And I hope that in the final result the negative ones could be ten times
more numerous than the positive ones.
What would be the most efficient way to do this?
Thanks,
--
Best regards, Alexey Grishchenko
phone: +353 (87) 262-2154
email: programme...@gmail.com
web: http://0x0fff.com
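The truncated message above reads like a down-sampling question: keep all positive examples and sample negatives so that they end up roughly ten times the positives. A minimal, Spark-free sketch of that idea (the 10:1 ratio, the names, and the seed are all assumptions):

```scala
// Keep every positive example; randomly sample negatives so the result
// contains at most `ratio` negatives per positive.
def downsample[T](pos: Seq[T], neg: Seq[T], ratio: Int, seed: Long = 42L): Seq[T] = {
  val rnd  = new scala.util.Random(seed)
  val keep = math.min(neg.size, pos.size * ratio)
  pos ++ rnd.shuffle(neg).take(keep)
}

val positives = Seq(1, 2)
val negatives = (100 to 170).toSeq // 71 candidates
val sampled   = downsample(positives, negatives, ratio = 10)
println(sampled.size) // 2 positives + 20 negatives = 22
```

On an RDD the analogous tools would be sample() or takeSample(), but the fragment does not say enough to pick one.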
--
Best regards, Alexey Grishchenko
phone: +353 (87) 262-2154
email: programme...@gmail.com
web: http://0x0fff.com
java.lang.UnsupportedClassVersionError:
org/apache/maven/cli/MavenCli : Unsupported major.minor version 51.0
Please advise on how to build it.
Thanks
Alexey
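For context on the error above: class-file major version 51.0 corresponds to Java 7, so the usual cause is running Maven on an older JVM (e.g. Java 6) than the one its classes were compiled for. The well-known mapping, as a tiny sketch (the diagnosis is an inference, not stated in the message):

```scala
// Class-file major version -> the minimum Java release that can load it.
val classMajorToJava = Map(50 -> "Java 6", 51 -> "Java 7", 52 -> "Java 8")
// "Unsupported major.minor version 51.0" therefore means the JVM running
// Maven is older than:
println(classMajorToJava(51)) // Java 7
```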
clean package
Is it only me who can’t build Spark 1.3?
And is there any site to download Spark prebuilt for Hadoop 2.5 and Hive?
Thank you for any help.
Alexey
somehow. Can you double check that and remove the Scala
classes from your app if they're there?
On Mon, Mar 23, 2015 at 10:07 PM, Alexey Zinoviev
alexey.zinov...@gmail.com wrote:
Thanks Marcelo, these options solved the problem (I'm using 1.3.0), but it
works only if I remove extends Logging from
<version>3.2.10</version>
</dependency>
The version is hard coded.
You can rebuild Spark 1.3.0 with json4s 3.2.11
Cheers
On Mon, Mar 23, 2015 at 2:12 PM, Alexey Zinoviev
alexey.zinov...@gmail.com wrote:
Spark has a dependency on json4s 3.2.10, but this version has several
bugs and I need
it with spark-1.3.0/bin/spark-submit --class App1 --conf
spark.driver.userClassPathFirst=true --conf
spark.executor.userClassPathFirst=true
$HOME/projects/sparkapp/target/scala-2.10/sparkapp-assembly-1.0.jar
Thanks,
Alexey
On Tue, Mar 24, 2015 at 5:03 AM, Marcelo Vanzin van...@cloudera.com wrote:
You
usage?
Thanks,
Alexey
I have tried select ceil(2/3), but got key not found: floor
On Tue, Jan 27, 2015 at 11:05 AM, Ted Yu yuzhih...@gmail.com wrote:
Have you tried floor() or ceil() functions ?
According to http://spark.apache.org/sql/, Spark SQL is compatible with
Hive SQL.
Cheers
On Mon, Jan 26, 2015 at
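If floor()/ceil() turn out to be unavailable in the SQL dialect, one possible workaround (an assumption, not something suggested in the thread) is to do the rounding on the Scala side; note that 2 / 3 is integer division there:

```scala
// In Scala, 2 / 3 is integer division and yields 0, so its ceil is 0.0;
// divide as doubles to get the arithmetic ceiling of two thirds.
val intDiv = math.ceil(2 / 3)     // 2 / 3 == 0
val dblDiv = math.ceil(2.0 / 3.0)
println((intDiv, dblDiv)) // (0.0,1.0)
```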
Any ideas? Anyone got the same error?
On Mon, Dec 1, 2014 at 2:37 PM, Alexey Romanchuk alexey.romanc...@gmail.com
wrote:
Hello spark users!
I found lots of strange messages in driver log. Here it is:
2014-12-01 11:54:23,849 [sparkDriver-akka.actor.default-dispatcher-25]
ERROR
Hello spark users and developers!
I am using HDFS + Spark SQL + Hive schema + Parquet as the storage format. I
have a lot of Parquet files; each file fits one HDFS block and holds one day of
data. The strange thing is that the very first Spark SQL query is very slow.
To reproduce the situation I use only one core, and I get 97 sec
the upfront compilation really helps. I doubt it.
However, this is almost surely due to caching somewhere, in Spark SQL
or HDFS. I really doubt HotSpot makes a difference compared to these
much larger factors.
On Fri, Oct 10, 2014 at 8:49 AM, Alexey Romanchuk
alexey.romanc...@gmail.com wrote
- https://gist.github.com/13h3r/6e5053cf0dbe33f2
Do you have any idea where to look at?
Thanks!
On Fri, Sep 26, 2014 at 10:35 AM, Andrew Ash and...@andrewash.com wrote:
Hi Alexey,
You should see in the logs a locality measure like NODE_LOCAL,
PROCESS_LOCAL, ANY, etc. If your Spark workers
Hello again spark users and developers!
I have a standalone Spark cluster (1.1.0) with Spark SQL running on it. My
cluster consists of 4 datanodes, and the replication factor of the files is 3.
I use the Thrift server to access Spark SQL and have 1 table with 30+
partitions. When I run a query on the whole table