47:25, "lk_spark" wrote:
hi,all :
I have a sql statement wich can be run on spark 3.2.1 but not on spark 3.3.1 .
when I try to explain it, will got error with message:
org.apache.spark.sql.catalyst.expressions.Literal cannot be cast to
org.apache.spark.sql.catalyst.expressions.AnsiCast
When I execute the SQL, the error stack is:
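Not part of the original message, but one possible first diagnostic, assuming (this is only an assumption) that the Literal/AnsiCast ClassCastException is tied to ANSI cast resolution; spark.sql.ansi.enabled is a standard Spark 3.x config:

// Sketch: toggle ANSI mode and re-run explain to see whether the failure
// follows the ANSI cast path (assumption; problemSql stands for the
// statement from the report, which is not shown in the thread).
spark.conf.set("spark.sql.ansi.enabled", "false")
spark.sql(problemSql).explain(true)

spark.conf.set("spark.sql.ansi.enabled", "true")
spark.sql(problemSql).explain(true)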
hi, all:
When I try to merge an Iceberg table with Spark, I can see a failed job on the
Spark UI, but the Spark application's final state is SUCCEEDED.
I submitted an issue: https://github.com/apache/iceberg/issues/5876
I would like to know whether this is a real error. Thanks.
Sorry, it was a problem with my environment.
At 2022-03-21 14:00:01, "lk_spark" wrote:
hi, all:
I got a strange error:
bin/spark-shell --deploy-mode client
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
setLogLevel(newLevel).
22/03/21 13:51:39 WARN util.Utils: spark.executor.instances less than
spark.dynamicAllocation.minExecutors is invalid, ignoring its setting,
please update your configs.
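For reference, a sketch of a configuration that avoids this warning: with dynamic allocation on, spark.executor.instances acts as the initial executor count and should be at least spark.dynamicAllocation.minExecutors, or be left unset (the values below are made up):

import org.apache.spark.sql.SparkSession

// With dynamic allocation enabled, spark.executor.instances is the initial
// executor count; setting it below minExecutors triggers the warning above
// and the setting is ignored.
val spark = SparkSession.builder()
  .appName("dyn-alloc-example")
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.dynamicAllocation.minExecutors", "2")
  .config("spark.executor.instances", "4") // keep >= minExecutors
  .getOrCreate()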
hi, all:
I'm using Spark 2.4, and I'm trying to use the SparkContext from multiple threads. I
found an example:
https://hadoopist.wordpress.com/2017/02/03/how-to-use-threads-in-spark-job-to-achieve-parallel-read-and-writes/
Some code like this:
for (a <- 0 until 4) {
  val thread = new Thread {
I found that _sqlContext is null. How can I resolve it?
2019-11-25
lk_spark
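A minimal sketch of the multi-threaded pattern that works: share one SparkSession created up front on the main thread and submit jobs from each worker thread (SparkContext job submission is thread-safe). The paths and the thread count here are made up:

import org.apache.spark.sql.SparkSession

object ParallelReads {
  def main(args: Array[String]): Unit = {
    // Create the session once, on the main thread, before spawning workers.
    val spark = SparkSession.builder().appName("parallel-reads").getOrCreate()

    val threads = (0 until 4).map { i =>
      new Thread {
        override def run(): Unit = {
          // Each thread submits its own job through the shared session.
          val df = spark.read.parquet(s"/user/devuser/testdata/df$i") // hypothetical paths
          println(s"thread $i rows = ${df.count()}")
        }
      }
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
  }
}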
发件人:"lk_spark"
发送时间:2019-11-25 16:00
主题:how spark structrued stream write to kudu
收件人:"user.spark"
抄送:
hi, all:
I'm using Spark 2.4.4 to readStream data from Kafka and want to write it to Kudu.
CstoreNew2KUDU$$anon$1.process(CstoreNew2KUDU.scala:122)
...
and SQLImplicits.scala:228 is:
227: implicit def localSeqToDatasetHolder[T : Encoder](s: Seq[T]): DatasetHolder[T] = {
228:   DatasetHolder(_sqlContext.createDataset(s))
229: }
Can anyone give me some help?
2019-11-25
lk_spark
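The stack trace points at localSeqToDatasetHolder being hit inside the sink's process() method, i.e. a Dataset is being created on an executor, where _sqlContext is null. A minimal sketch of an alternative, assuming Spark 2.4's foreachBatch and the kudu-spark KuduContext (the master address, brokers, topic, and table name are made up):

import org.apache.kudu.spark.kudu.KuduContext
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder().appName("kafka-to-kudu").getOrCreate()
val kuduContext = new KuduContext("kudu-master:7051", spark.sparkContext) // hypothetical master

val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092") // hypothetical brokers
  .option("subscribe", "mytopic")                   // hypothetical topic
  .load()
  .selectExpr("CAST(value AS STRING) AS value")

// Write each micro-batch as a whole DataFrame; nothing in this path calls
// toDS/toDF on an executor, so _sqlContext is never touched there.
stream.writeStream
  .foreachBatch { (batch: DataFrame, batchId: Long) =>
    kuduContext.insertRows(batch, "impala::default.my_table") // hypothetical table
  }
  .start()
  .awaitTermination()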
hi, all:
I have a Hive table STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'; many of its files are very
small. When I use Spark to read it, thousands of tasks start. How can I
limit the number of tasks?
2019-11-12
lk_spark
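A sketch of two common knobs for this, as assumptions to verify against your version: if the table goes through Spark's native ORC reader, the file-based split settings let Spark pack many small files into one partition; otherwise, coalesce after reading:

// Let Spark's native ORC reader handle the Hive table (the default of
// convertMetastoreOrc varies by Spark version; verify for yours).
spark.conf.set("spark.sql.hive.convertMetastoreOrc", "true")
// Pack small files together: bigger target partitions, per-file open cost.
spark.conf.set("spark.sql.files.maxPartitionBytes", 256L * 1024 * 1024)
spark.conf.set("spark.sql.files.openCostInBytes", 8L * 1024 * 1024)

val df = spark.table("my_orc_table") // hypothetical table name
// Or simply cut the partition count after reading:
val fewerTasks = df.coalesce(200)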
I want to parse the struct of the data dynamically, then write the data to Delta Lake;
I think it can automatically merge the schema.
2019-09-17
lk_spark
From: Tathagata Das
Sent: 2019-09-17 16:13
Subject: Re: how can I dynamic parse json in kafka when using Structured Streaming
To: "lk_spark"
implicit evidence$6:
org.apache.spark.sql.Encoder[org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema])org.apache.spark.sql.Dataset[org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema].
Unspecified value parameter evidence$6.
val words = lines.map(line => {
2019-09-17
lk_spark
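The Encoder error above comes from mapping to GenericRowWithSchema, for which Spark has no implicit encoder. A sketch of the usual alternative: parse the Kafka value with from_json against a schema built at startup, so the result is ordinary typed columns (the schema fields, brokers, and topic are made up):

import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical schema; in practice it could be built from config at startup.
val schema = StructType(Seq(
  StructField("name", StringType),
  StructField("city", StringType)
))

val lines = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092") // hypothetical brokers
  .option("subscribe", "mytopic")                   // hypothetical topic
  .load()

// from_json yields typed columns, so no Encoder for Row is ever needed.
val parsed = lines
  .selectExpr("CAST(value AS STRING) AS json")
  .select(from_json(col("json"), schema).as("data"))
  .select("data.*")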
Sorry, for now what I can do is something like this:
var df5 = spark.read.parquet("/user/devuser/testdata/df1").coalesce(1)
df5 = df5.union(df5).union(df5).union(df5).union(df5)
2018-12-14
lk_spark
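If the goal is just a large dataset generated in parallel, spark.range produces a distributed Dataset directly, without reading and unioning files; a minimal sketch with made-up sizes and columns:

import org.apache.spark.sql.functions.{col, rand}

// 100 million rows generated across 200 partitions, entirely on executors.
val big = spark.range(0L, 100000000L, 1L, 200)
  .withColumn("key", col("id") % 1000)
  .withColumn("value", rand())
big.write.parquet("/user/devuser/testdata/generated") // hypothetical output path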
From: 15313776907 <15313776...@163.com>
Sent: 2018-12-14 16:39
Subject: Re: how to generate a large dataset in parallel
I want to generate some data in Spark.
2018-12-14
lk_spark
From: Jean Georges Perrin
Sent: 2018-12-14 11:10
Subject: Re: how to generate a large dataset in parallel
To: "lk_spark"
Cc: "user.spark"
Do you just want to generate some data in Spark, or ingest a large dataset from
outside of Spark?
cluster.
2018-12-14
lk_spark
…the table has 5760749 rows of data.
After running it about 10 times, the driver's physical memory grows beyond 4.5 GB and
the application is killed by YARN.
I saw the old-generation memory keep growing, and it cannot be released by GC.
2018-11-12
lk_spark
发件人:"lk_hadoop"
发送时间:2018-11-12 09:37
主题:about LIVY-424
收件人:"user
Resolved: I needed to add "kubernetes.default.svc" to the k8s API server's TLS config.
2018-04-08
lk_spark
发件人:"lk_spark"
发送时间:2018-04-08 11:15
主题:spark2.3 on kubernets
收件人:"user"
抄送:
hi,all:
I am trying spark on k8s with Pi sample.
I got error with driver
spark-examples_2.11-2.3.0.jar
2018-04-08
lk_spark
Thank you Kumar, I will try it later.
2017-06-22
lk_spark
From: Pralabh Kumar
Sent: 2017-06-22 20:20
Subject: Re: Re: spark2.1 kafka0.10
To: "lk_spark"
Cc: "user.spark"
It looks like the replicas for your partition are failing. If you have more
brokers, can you try increasing
Each topic has 5 partitions, 2 replicas.
2017-06-22
lk_spark
From: Pralabh Kumar
Sent: 2017-06-22 17:23
Subject: Re: spark2.1 kafka0.10
To: "lk_spark"
Cc: "user.spark"
How many replicas do you have for this topic?
On Thu, Jun 22, 2017 at 9:19
org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:88)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
2017-06-22
lk_spark
发件人:"lk_spark"
发送时间:2017-06-22 11:13
主题:spark2.1 kafka0.10
收件人:"user.spark"
抄送:
hi,all:
when I run stream application for a few minut
ERROR JobScheduler: Error generating jobs for time
1498098896000 ms
java.lang.IllegalStateException: No current assignment for partition pages-2
I don't know why ?
2017-06-22
lk_spark
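A minimal sketch of the spark-streaming-kafka-0-10 direct stream, for comparison. "No current assignment for partition" is often raised when another consumer with the same group.id takes over the assignment mid-run, so the group id here is kept unique to the application (the brokers, group id, and topic are made up; sc is the shell's SparkContext):

import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010._
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(5))
val kafkaParams = Map[String, Object](
  ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG -> "broker:9092", // hypothetical brokers
  ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
  ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
  ConsumerConfig.GROUP_ID_CONFIG -> "pages-consumer-unique", // hypothetical; keep unique per app
  ConsumerConfig.AUTO_OFFSET_RESET_CONFIG -> "latest"
)
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](Seq("pages"), kafkaParams)
)
stream.map(_.value()).print()
ssc.start()
ssc.awaitTermination()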
hi, all:
https://issues.apache.org/jira/browse/SPARK-19680
Is there any way to patch this issue? I have met the same problem.
2017-06-20
lk_spark
Thanks Kumar, that's really helpful!!
2017-06-16
lk_spark
From: Pralabh Kumar
Sent: 2017-06-16 18:30
Subject: Re: Re: how to call udf with parameters
To: "lk_spark"
Cc: "user.spark"
val getlength = udf((idx1: Int, idx2: Int, data: String) =>
  data.substring(idx1, idx2))
data
Thanks Kumar. I want to know how to call a udf with multiple parameters, maybe
a udf that works as a substr function. How can I pass parameters for the begin and
end index? I tried it and got errors. Can the udf parameters only be of
Column type?
2017-06-16
lk_spark
From: Pralabh Kumar
Sent: 2017
)
org.apache.spark.sql.AnalysisException: cannot resolve '`true`' given input
columns: [id, text];;
'Project [UDF(text#6, 'true, 'true, '2) AS words#16]
+- Project [_1#2 AS id#5, _2#3 AS text#6]
+- LocalRelation [_1#2, _2#3]
I need help!!
2017-06-16
lk_spark
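The cannot resolve '`true`' error above is what happens when non-Column arguments are passed to a udf call: every argument must be a Column, so scalar parameters are wrapped with lit(). A minimal sketch (the column names follow the plan output above):

import org.apache.spark.sql.functions.{col, lit, udf}
import spark.implicits._

val substrUdf = udf((begin: Int, end: Int, data: String) => data.substring(begin, end))

val df = Seq((1, "hello"), (2, "world")).toDF("id", "text")
// Wrap the scalar begin/end indexes in lit() so they become Columns.
df.withColumn("words", substrUdf(lit(0), lit(2), col("text"))).show()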
Can anybody give me some clue?
2017-05-15
lk_spark
Thank you, that's what I wanted to confirm.
2017-03-16
lk_spark
From: Yuhao Yang
Sent: 2017-03-16 13:05
Subject: Re: Re: how to call recommend method from ml.recommendation.ALS
To: "lk_spark"
Cc: "任弘迪","user.spark"
This is something that was just added to ML and
Thanks for your reply. What I exactly want to know is:
In the package mllib.recommendation, MatrixFactorizationModel has methods like
recommendProducts, but I didn't find them in the package ml.recommendation.
How can I do the same thing with ml as with mllib?
2017-03-16
lk_spark
From: 任弘迪
hi, all:
Under Spark 2.0, I want to know: after training a
ml.recommendation.ALSModel, how can I perform the recommend action?
I tried to save the model and load it with MatrixFactorizationModel, but got an
error.
2017-03-16
lk_spark
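Not in the original thread, but for reference: recommend methods were added directly to ml.recommendation.ALSModel in Spark 2.2 (recommendForAllUsers / recommendForAllItems). A minimal sketch, assuming a ratings DataFrame with user, item, and rating columns (hypothetical input):

import org.apache.spark.ml.recommendation.ALS

// ratings: a DataFrame with columns (user: Int, item: Int, rating: Float); hypothetical input.
val als = new ALS()
  .setUserCol("user")
  .setItemCol("item")
  .setRatingCol("rating")
val model = als.fit(ratings)

// The ml counterpart of mllib's recommendProducts (Spark 2.2+):
val top10PerUser = model.recommendForAllUsers(10)
top10PerUser.show()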
value().matches("\\d{4}.*")).map(record => {
  val assembly = record.topic()
  val value = record.value
  val datatime = value.substring(0, 22)
  val level = value.substring(24, 27)
  (assembly, value, datatime, level)
})
How can I pass a parameter to the map function?
2017-02-27
lk_spark
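To pass parameters into the map function, capture them in the closure; Spark serializes captured values with the task. A minimal sketch (records stands for the filtered stream in the snippet above, and the cut points are made up):

// Plain values captured by the closure are shipped to executors automatically.
val datatimeRange = (0, 22)
val levelRange = (24, 27)

val parsed = records.map { record =>
  val value = record.value()
  val datatime = value.substring(datatimeRange._1, datatimeRange._2)
  val level = value.substring(levelRange._1, levelRange._2)
  (record.topic(), value, datatime, level)
}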
…in 120 seconds
... 8 more
17/01/20 06:39:05 ERROR CoarseGrainedExecutorBackend: Driver
192.168.0.136:51197 disassociated! Shutting down.
2017-01-20
lk_spark
2017-01-18
lk_spark
2017-01-17
lk_spark
} else {
  ab += attributes(i)
}
}
new GenericRow(ab.toArray)
}
}
2017-01-13
lk_spark
发件人:"lk_spark"
发送时间:2017-01-13 09:49
主题:Re: Re: Re: how to change datatype by useing StructType
收件人:"Nicholas Hakobian"
抄送:"user.sp
Thank you Nicholas. If the source data is in CSV format, the CSV reader works well.
2017-01-13
lk_spark
From: Nicholas Hakobian
Sent: 2017-01-13 08:35
Subject: Re: Re: Re: how to change datatype by using StructType
To: "lk_spark"
Cc: "ayan guha","user.spark"
Have you t
…UnsafeProjection.apply_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:290)
All the fields were of type Any; what should I do?
2017-01-12
lk_spark
发件人:"lk_sp
Yes, field year is in my data:
data:
kevin,30,2016
shen,30,2016
kai,33,2016
wei,30,2016
This will not work:
val rowRDD = peopleRDD.map(_.split(",")).map(attributes =>
  Row(attributes(0), attributes(1), attributes(2)))
But I need to read the data configurably.
2017-01-12
lk_spark
level row object), 0, name), StringType),
true)
If I change my code it will work:
val rowRDD = peopleRDD.map(_.split(",")).map(attributes =>
  Row(attributes(0), attributes(1).toInt))
But this is not a good idea.
2017-01-12
lk_spark
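A sketch of the configurable conversion being asked for above: drive each field's cast from the StructType itself instead of hard-coding .toInt per column (the schema mirrors the sample data; peopleRDD is the RDD from the snippets above):

import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

val schema = StructType(Seq(
  StructField("name", StringType),
  StructField("age", IntegerType),
  StructField("year", IntegerType)
))

// Cast a raw string according to the declared DataType of its field.
def convert(raw: String, dt: DataType): Any = dt match {
  case IntegerType => raw.trim.toInt
  case LongType    => raw.trim.toLong
  case DoubleType  => raw.trim.toDouble
  case _           => raw
}

val rowRDD = peopleRDD.map(_.split(",")).map { attributes =>
  Row.fromSeq(attributes.zip(schema.fields).map { case (a, f) => convert(a, f.dataType) })
}
val df = spark.createDataFrame(rowRDD, schema)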
...|
|MzIzMjQ4NzQwOA==|http://mp.weixin|
|MzAwOTIxMTcyMQ==|http://mp.weixin|
|MzA3OTAyNzY2OQ==|http://mp.weixin|
|MjM5NDAzMDAwMA==|http://mp.weixin|
|MzAwMjE4MzU0Nw==|http://mp.weixin....|
|MzA4NzcyNjI0Mw==|http://mp.weixin|
|MzI5OTE5Nzc5Ng==|http://mp.weixin|
2016-12-06
lk_spark
Thanks for the reply. I will look at how to use na.fill. And I don't know how to
get the value of the column and do some operations like substr or split.
2016-12-06
lk_spark
From: Pankaj Wahane
Sent: 2016-12-06 17:39
Subject: Re: how to add a column to a dataframe
To: "lk_spark","user
|
| null|http://mp.weixin|
| null|http://mp.weixin|
| null|http://mp.weixin|
| null|http://mp.weixin|
| null|http://mp.weixin|
Why is what I got null?
2016-12-06
lk_spark
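Column values are transformed with built-in functions rather than extracted one by one; a minimal sketch of substr and split on a URL-like column (df and the column name url are assumptions based on the output above):

import org.apache.spark.sql.functions.{col, split, substring}

// Derive new columns from an existing one; no manual value extraction needed.
val enriched = df
  .withColumn("prefix", substring(col("url"), 1, 7))      // e.g. "http://"
  .withColumn("host", split(col("url"), "/").getItem(2))  // element after "http:" and ""
// na.fill replaces any nulls produced along the way:
enriched.na.fill("unknown", Seq("host")).show()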
…e of input (if you try to input this parquet).
Again, the important question is: why do you need it to be one file? Are you
planning to use it externally? If yes, can you not use fragmented files there?
If the data is too big for the Spark executor, it'll most certainly be too much
2016-11-10 15:11
/parquetdata/weixin/biztags/biztag2/part-r-00176-0f61afe4-23e8-40bb-b30b-09652ca677bc
more and more...
2016-11-10
lk_spark
: string (nullable = true)
2016-10-21
lk_spark
From: 颜发才 (Yan Facai)
Sent: 2016-10-21 15:35
Subject: Re: How to iterate the element of an array in DataFrame?
To: "user.spark"
Cc:
I don't know how to construct `array>`.
Could anyone help me?
I try to get the array by:
scala> mb
…refresh the metadata.
Spark doesn't recognize the data in the subdirectory. How can I do that?
2016-10-20
lk_spark
Thank you, all of you. explode() is helpful:
df.selectExpr("explode(bizs) as e").select("e.*").show()
2016-10-19
lk_spark
From: Hyukjin Kwon
Sent: 2016-10-19 13:16
Subject: Re: how to extract arraytype data to file
To: "Divya Gehlot"
Cc: "lk_spark"
code|
+--------------------+--------------------+
|[4938200, 4938201...|[罗甸网警, 室内设计师杨焰红, ...|
|[4938300, 4938301...|[SDCS十全九美, 旅梦长大, ...|
|[4938400, 4938401...|[日重重工液压行走回转, 氧老家,...|
|[4938500, 4938501...|[PABXSLZ, 陈少燕, 笑蜜...|
|[4938600, 4938601...|[税海微云, 西域美农云家店, 福...|
+--------------------+--------------------+
What I want is to read the column as a normal row type. How can I do that?
2016-10-19
lk_spark