Can you cross-check your cassandra-rackdc.properties and
cassandra-topology.properties files? It could be a misconfiguration. Also,
it's better to look at the Cassandra logs to see what's happening internally.
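For reference, a consistent cassandra-rackdc.properties names the node's
datacenter and rack; a minimal sketch (values are examples, not your
cluster's):

dc=DC1
rack=RAC1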
Thanks
Best Regards
On Fri, Dec 26, 2014 at 7:23 AM, Zhang Jiaqiang
wrote:
> Hello All,
We have a mirror of the user and developer mailing lists on Nabble, but
unfortunately this has led to significant usability issues because users
may attempt to post messages through Nabble which silently fail to get
posted to the actual Apache list and thus are never read by most
subscribers:
http:
Hi,
Say I have created a clustering model using KMeans for 100 million
transactions at time t1. I am using streaming, and say every hour I need to
update my existing model. How do I do it? Should it include all the data
every time, or can it be updated incrementally?
If I can do an incremental
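For what it's worth, MLlib in Spark 1.2 added StreamingKMeans, which updates
cluster centers incrementally on each batch. A minimal sketch, where
trainingStream is an assumed DStream[Vector] of already-parsed transactions:

import org.apache.spark.mllib.clustering.StreamingKMeans

val model = new StreamingKMeans()
  .setK(10)                    // number of clusters (example value)
  .setDecayFactor(1.0)         // 1.0 weights all past data equally
  .setRandomCenters(20, 0.0)   // feature dimension and initial weight (example)

model.trainOn(trainingStream)  // incrementally updates the model per batch
model.predictOn(trainingStream).print()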
Hi,
I want to find the time taken to replicate an RDD in a Spark cluster, along
with the computation time on the replicated RDD.
Can someone please suggest some ideas?
Thank you
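One rough way to approximate both numbers: persist with a replicated storage
level and time the actions that materialize and then reuse the RDD. A sketch,
assuming rdd and sc already exist; this measures wall-clock time around
actions, not replication in isolation:

import org.apache.spark.storage.StorageLevel

rdd.persist(StorageLevel.MEMORY_ONLY_2)        // 2x in-memory replication
val t0 = System.nanoTime()
rdd.count()                                    // forces compute + replication
val materializeMs = (System.nanoTime() - t0) / 1e6

val t1 = System.nanoTime()
rdd.map(identity).count()                      // computation on the cached copy
val computeMs = (System.nanoTime() - t1) / 1e6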
Hi ,
You can try reduceByKey also.
Something like this:

JavaPairRDD<String, Integer> ones = lines
    .mapToPair(new PairFunction<String, String, Integer>() {
        @Override
        public Tuple2<String, Integer> call(String s) {
            String[] parts = s.split(",");  // "key,value" lines
            return new Tuple2<String, Integer>(parts[0], 1);
        }
    });
Hi,
Hadoop Configuration is only Writable, not Java-serializable. You can use
SerializableWritable (in Spark) to wrap the Configuration to make it
serializable, and use a broadcast variable to broadcast this conf to all the
nodes; then you can use it in mapPartitions rather than serializing it with
the closure.
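A sketch of that pattern, assuming sc and some rdd already exist:

import org.apache.hadoop.conf.Configuration
import org.apache.spark.SerializableWritable

val confBc = sc.broadcast(new SerializableWritable(new Configuration()))
rdd.mapPartitions { iter =>
  val conf = confBc.value.value  // unwrap the Configuration on the executor
  // ... use conf here, e.g. with the Hadoop FileSystem API ...
  iter
}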
Hello All,
I'm a newbie to Spark and Cassandra. I'm trying to run the Spark demo (the
Portfolio demo bundled with DSE Cassandra) in a cluster environment but
cannot get it to work.
This issue may not really be coming from Spark, but I am really not sure how
to investigate it further. Please help me.
There are 5 CentOS servers in
Hi,
On Fri, Dec 26, 2014 at 10:13 AM, ey-chih chow wrote:
> I should rephrase my question as follows:
>
> How to use the corresponding Hadoop Configuration of a HadoopRDD in
> defining
> a function as an input parameter to the MapPartitions function?
>
Well, you could try to pull the `val confi
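Presumably the idea is to pull the configuration into a local val before the
closure, so only a small serializable wrapper is captured rather than the
enclosing object. A guess at the pattern, with hypothetical names:

import org.apache.spark.SerializableWritable

val confSer = new SerializableWritable(sc.hadoopConfiguration)
sessions.mapPartitions { iter =>
  val hadoopConf = confSer.value  // the Configuration, usable per partition
  // ... define the partition function in terms of hadoopConf ...
  iter
}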
Nick,
uh, I would have expected a rather heated discussion, but the opposite
seems to be the case ;-)
Independent of my personal preferences w.r.t. usability, habits etc., I
think it is not good for a software/tool/framework if questions and
discussions are spread over too many places. I guess ev
I should rephrase my question as follows:
How to use the corresponding Hadoop Configuration of a HadoopRDD in defining
a function as an input parameter to the MapPartitions function?
Thanks.
Ey-Chih Chow
Hi,
On Fri, Dec 26, 2014 at 1:32 AM, ey-chih chow wrote:
>
> I got some issues with mapPartitions with the following piece of code:
>
> val sessions = sc
> .newAPIHadoopFile(
> "... path to an avro file ...",
> classOf[org.apache.avro.mapreduce.AvroKeyInputFormat[ByteBuffer]],
Hi,
On Fri, Dec 26, 2014 at 5:22 AM, Amit Behera wrote:
>
> How can I do it? Please help me do it.
>
Have you considered using groupByKey?
http://spark.apache.org/docs/latest/programming-guide.html#transformations
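A minimal sketch for the key/value CSV case in this thread (file path is
hypothetical):

val grouped = sc.textFile("keys.csv")
  .map { line => val Array(k, v) = line.split(",", 2); (k, v) }
  .groupByKey()                  // RDD[(String, Iterable[String])]
grouped.collect().foreach { case (k, vs) =>
  println(s"$k:[${vs.mkString(",")}]")
}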
Tobias
The following command works
./make-distribution.sh --tgz -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -Phive -DskipTests
------ Original ------
From: "guxiaobo1982"
Send time: Thursday, Dec 25, 2014 3:58 PM
To: "Ted Yu"
Cc: "user@spark.apac
Hello all - can anyone please offer any advice on this issue?
-Ilya Ganelin
On Mon, Dec 22, 2014 at 5:36 PM, Ganelin, Ilya
wrote:
> Hi all, I have a long-running job iterating over a huge dataset. Parts of
> this operation are cached. Since the job runs for so long, eventually the
> overhead of
Hi Users,
I am reading a CSV file, and my data format is like:
key1,value1
key1,value2
key1,value1
key1,value3
key2,value1
key2,value5
key2,value5
key2,value4
key1,value4
key1,value4
key3,value1
key3,value1
key3,value2
required output:
key1:[value1,value2,value1,value3,value4,value4]
key2:[val
Sorry for the typo.
Apache Hadoop version is 2.6.0
Regards,
Sam
Hi All,
I am new to both Scala & Spark, so please expect some mistakes.
Setup:
Scala : 2.10.2
Spark : Apache 1.1.0
Hadoop : Apache 2.4
Intent of the code: to read from a Kafka topic & do some processing.
Below are the code details and the error I am getting:
import org.apache.spark._
import
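The code above is cut off at the imports; for reference, a minimal
receiver-based Kafka setup on Spark 1.1 looks roughly like this (requires
the spark-streaming-kafka artifact; the ZooKeeper address, group id, and
topic name are placeholders):

import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.streaming.kafka.KafkaUtils

val ssc = new StreamingContext(new SparkConf().setAppName("KafkaReader"),
  Seconds(10))
val lines = KafkaUtils
  .createStream(ssc, "zkhost:2181", "my-consumer-group", Map("my-topic" -> 1))
  .map(_._2)                     // keep only the message payload
lines.print()
ssc.start()
ssc.awaitTermination()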
Thanks. I marked the variable as transient and moved ahead; now I am getting
an exception while executing the query.

static final transient SparkConf sparkConf =
    new SparkConf().setAppName("NumberCount");
static final transient JavaSparkContext jc = new JavaSparkContext(sparkConf);
static
Spark 1.2.0 is SO much more usable than previous releases -- many thanks to
the team for this release.
A question about progress of actions. I can see how things are progressing
using the Spark UI. I can also see the nice ASCII art animation on the
spark driver console.
Has anyone come up with
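If the goal is programmatic progress reporting, one option is a rough sketch
using the SparkListener developer API (not an official progress API):

import org.apache.spark.scheduler._

sc.addSparkListener(new SparkListener {
  override def onStageCompleted(stage: SparkListenerStageCompleted): Unit =
    println(s"Stage ${stage.stageInfo.stageId} done: ${stage.stageInfo.name}")
  override def onTaskEnd(task: SparkListenerTaskEnd): Unit =
    println(s"Task ended in stage ${task.stageId}")
})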
Hi,
I got some issues with mapPartitions with the following piece of code:
val sessions = sc
.newAPIHadoopFile(
"... path to an avro file ...",
classOf[org.apache.avro.mapreduce.AvroKeyInputFormat[ByteBuffer]],
classOf[AvroKey[ByteBuffer]],
classOf[NullWritable],
Nice idea, although it needs a hosting plan on their side, or Spark to host
it, if I'm not wrong.
I've been using Slack for discussions; it's not exactly the same as
Discourse, the ML, or SO, but it offers interesting features.
It's more in the mood of IRC integrated with external services.
my2c
On Wed Dec
Hi Guys,
I found an exception while running an application using the 1.2.0-snapshot
version. It looks like this:
2014-12-23 07:45:36,333 | ERROR | [Executor task launch worker-0] |
Exception in task 0.0 in stage 0.0 (TID 0) |
org.apache.spark.Logging$class.logError(Logging.scala:96)
java.io.StreamCorruptedException
Hi, I think a modeling tool may be helpful because sometimes it's
hard/tricky to program Spark. I don't know if there is already such a
tool.
Thanks!
Hi Kevin,
Were you able to build Spark with the command "export MAVEN_OPTS="-Xmx2g
-XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m" && mvn -Pdeb
-DskipTests clean package"?
I am getting the below error for all versions of Spark (even 1.2.0):
Failed to execute goal org.vafer:jdeb:0.11:jdeb (defaul
I’ve created a JIRA issue for this:
https://issues.apache.org/jira/browse/SPARK-4967
Originally we wanted to support scanning multiple Parquet file paths, I
guess, and those file paths are kept in a single string separated by commas
internally; however, I didn't find any public example saying we support