[jira] [Created] (SPARK-21601) Modify the JDK version of the Maven compilation

2017-08-01 Thread jifei_yang (JIRA)
jifei_yang created SPARK-21601:
--

 Summary: Modify the JDK version of the Maven compilation
 Key: SPARK-21601
 URL: https://issues.apache.org/jira/browse/SPARK-21601
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 2.2.0
Reporter: jifei_yang
Priority: Minor
 Fix For: 2.2.0






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-21601) Modify the JDK version of the Maven compilation

2017-08-01 Thread jifei_yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jifei_yang closed SPARK-21601.
--
Resolution: Won't Do

> Modify the JDK version of the Maven compilation
> ---
>
> Key: SPARK-21601
> URL: https://issues.apache.org/jira/browse/SPARK-21601
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.2.0
>Reporter: jifei_yang
>Priority: Minor
> Fix For: 2.2.0
>
>
> When compiling Spark with Maven, I would like to add a property for changing 
> the JDK version. This would be more user-friendly.






[jira] [Commented] (SPARK-21601) Modify the JDK version of the Maven compilation

2017-08-01 Thread jifei_yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110250#comment-16110250
 ] 

jifei_yang commented on SPARK-21601:


I just learned that the Spark team now uses maven-compiler-plugin to compile 
Java code: [https://github.com/apache/spark/commit/74cda94c5e496e29f42f1044aab90cab7dbe9d38]







[jira] [Commented] (SPARK-21601) Modify the JDK version of the Maven compilation

2017-08-01 Thread jifei_yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110204#comment-16110204
 ] 

jifei_yang commented on SPARK-21601:


As follows,
1.8
1.8
1.8







[jira] [Commented] (SPARK-21601) Modify the JDK version of the Maven compilation

2017-08-01 Thread jifei_yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110234#comment-16110234
 ] 

jifei_yang commented on SPARK-21601:


[https://github.com/apache/spark/pull/18807]







[jira] [Updated] (SPARK-21601) Modify the JDK version of the Maven compilation

2017-08-01 Thread jifei_yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jifei_yang updated SPARK-21601:
---
Description: When compiling Spark with Maven, I would like to add a property 
for changing the JDK version. This would be more user-friendly.







[jira] [Created] (SPARK-21664) Use the column name as the file name.

2017-08-08 Thread jifei_yang (JIRA)
jifei_yang created SPARK-21664:
--

 Summary:  Use the column name as the file name.
 Key: SPARK-21664
 URL: https://issues.apache.org/jira/browse/SPARK-21664
 Project: Spark
  Issue Type: Improvement
  Components: Input/Output
Affects Versions: 2.2.0
Reporter: jifei_yang
 Fix For: 2.2.0


When saving a DataFrame, we want to use the column name as the file name. 
This can be done with PairRDDFunctions. Can the same be done with a DataFrame? Thank you.
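
A minimal sketch of what the PairRDDFunctions route mentioned above might look like, assuming it refers to saveAsHadoopFile combined with Hadoop's MultipleTextOutputFormat. The names KeyAsFileNameOutputFormat and SaveByKeySketch, the sample data, and the output path argument are hypothetical and serve only as illustration.

```scala
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical output format: prefixes each output file name with the record's key.
class KeyAsFileNameOutputFormat extends MultipleTextOutputFormat[Any, Any] {
  // `name` is the default leaf file name chosen by Hadoop for the task.
  override def generateFileNameForKeyValue(key: Any, value: Any, name: String): String =
    s"${key}-${name}"

  // Replace the key with NullWritable so that only the value is written to each line.
  override def generateActualKey(key: Any, value: Any): Any =
    NullWritable.get()
}

object SaveByKeySketch {
  def main(args: Array[String]): Unit = {
    val outPath = args(0) // assumed: a directory that does not exist yet
    val conf = new SparkConf().setAppName("save-by-key-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Sample (key, value) pairs; the key decides the file name prefix.
      val pairs = sc.parallelize(Seq(("blue", "alice"), ("red", "bob"), ("blue", "carol")))
      pairs.saveAsHadoopFile(
        outPath, classOf[String], classOf[String], classOf[KeyAsFileNameOutputFormat])
    } finally {
      sc.stop()
    }
  }
}
```

Under these assumptions, records with the key blue would end up in files whose names start with blue- inside the output directory.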






[jira] [Commented] (SPARK-21664) Use the column name as the file name.

2017-08-08 Thread jifei_yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118003#comment-16118003
 ] 

jifei_yang commented on SPARK-21664:



If so, how can it be achieved? Thank you very much!

>  Use the column name as the file name.
> --
>
> Key: SPARK-21664
> URL: https://issues.apache.org/jira/browse/SPARK-21664
> Project: Spark
>  Issue Type: Improvement
>  Components: Input/Output
>Affects Versions: 2.2.0
>Reporter: jifei_yang
>
> When saving a DataFrame, we want to use the column name as the file name. 
> This can be done with PairRDDFunctions. Can the same be done with a DataFrame? Thank you.






[jira] [Updated] (SPARK-21664) Use the column name as the file name.

2017-08-08 Thread jifei_yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jifei_yang updated SPARK-21664:
---
Issue Type: Question  (was: Improvement)

>  Use the column name as the file name.
> --
>
> Key: SPARK-21664
> URL: https://issues.apache.org/jira/browse/SPARK-21664
> Project: Spark
>  Issue Type: Question
>  Components: Input/Output
>Affects Versions: 2.2.0
>Reporter: jifei_yang
>
> When saving a DataFrame, we want to use the column name as the file name. 
> This can be done with PairRDDFunctions. Can the same be done with a DataFrame? Thank you.






[jira] [Commented] (SPARK-21664) Use the column name as the file name.

2017-08-08 Thread jifei_yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118007#comment-16118007
 ] 

jifei_yang commented on SPARK-21664:


Thank you. I would like to ask for help: how can the column name be used as 
the file name when saving, as is possible with PairRDDFunctions?







[jira] [Commented] (SPARK-23148) spark.read.csv with multiline=true gives FileNotFoundException if path contains spaces

2018-01-18 Thread jifei_yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331817#comment-16331817
 ] 

jifei_yang commented on SPARK-23148:


If you must do this, I think it is best to escape the spaces in the path.

> spark.read.csv with multiline=true gives FileNotFoundException if path 
> contains spaces
> --
>
> Key: SPARK-23148
> URL: https://issues.apache.org/jira/browse/SPARK-23148
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Bogdan Raducanu
>Priority: Major
>
> Repro code:
> {code:java}
> spark.range(10).write.csv("/tmp/a b c/a.csv")
> spark.read.option("multiLine", false).csv("/tmp/a b c/a.csv").count
> 10
> spark.read.option("multiLine", true).csv("/tmp/a b c/a.csv").count
> java.io.FileNotFoundException: File 
> file:/tmp/a%20b%20c/a.csv/part-0-cf84f9b2-5fe6-4f54-a130-a1737689db00-c000.csv
>  does not exist
> {code}
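
The path in the exception quoted above looks like the percent-encoded form of the original path. The following sketch, which uses only java.net.URI and is not taken from Spark's code, shows how a path with spaces changes when it is turned into a URI and how the raw and decoded forms differ. The object name PathEscapingSketch and the helper toFileUri are hypothetical.

```scala
import java.net.URI

object PathEscapingSketch {
  // Build a file: URI from a plain path. The multi-argument URI constructor
  // percent-encodes characters that are illegal in a URI path, such as spaces.
  def toFileUri(rawPath: String): URI = new URI("file", null, rawPath, null)

  def main(args: Array[String]): Unit = {
    val uri = toFileUri("/tmp/a b c/a.csv")
    println(uri)            // encoded URI, spaces become %20
    println(uri.getRawPath) // encoded path component
    println(uri.getPath)    // decoded path, with the original spaces
  }
}
```

If the encoded path component is later used as a literal file system path, the lookup would target a directory named a%20b%20c, which does not exist. This is only one plausible reading of the error and not a diagnosis of the actual bug.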






[jira] [Commented] (SPARK-21601) Modify the JDK version of the Maven compilation

2018-01-25 Thread jifei_yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340568#comment-16340568
 ] 

jifei_yang commented on SPARK-21601:


Thanks.

> Modify the JDK version of the Maven compilation
> ---
>
> Key: SPARK-21601
> URL: https://issues.apache.org/jira/browse/SPARK-21601
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.3.0
>Reporter: jifei_yang
>Priority: Minor
>
> When compiling Spark with Maven, I would like to add a property for changing 
> the JDK version. This would be more user-friendly.






[jira] [Closed] (SPARK-21664) Use the column name as the file name.

2018-01-29 Thread jifei_yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jifei_yang closed SPARK-21664.
--

We can use partitioning to save by column name, for example:

{code:java}
import scala.util.Random

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext, SaveMode}

case class UserInfo(name: String, favorite_number: Int, favorite_color: String)

def mainSaveAsParquet(args: Array[String]): Unit = {
  // Random suffix so that repeated runs are unlikely to reuse an output directory.
  val fileName = new Random().nextInt(43952858)
  val outPath =
    s"G:/project/idea15/xlwl/bigdata002/bigdata/sparkmvn/outpath/user/spark/parquet/temp/$fileName"
  val sparkConf = new SparkConf().setAppName("Spark Avro Test").setMaster("local[4]")

  // MyKryoRegistrator is not defined in this message; presumably the author's own Kryo setup helper.
  MyKryoRegistrator.register(sparkConf)

  val sc = new SparkContext(sparkConf)
  val sqlContext = new SQLContext(sc)

  // Build 3001 sample records, alternating between two user-agent strings.
  val array = new Array[UserInfo](3001)
  for (i <- 0 to 3000) {
    i % 2 match {
      case 0 =>
        array(i) = UserInfo("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36", 256 + (i / 102), "blue")
      case 1 =>
        array(i) = UserInfo("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36 Edge/15.15063", 256 + i, "blue")
    }
  }

  import sqlContext.implicits._
  val records: DataFrame = sc.parallelize(array).toDF()

  // partitionBy creates one output directory per distinct (name, favorite_number) pair.
  records.repartition(1).write.partitionBy("name", "favorite_number").format("parquet").mode(SaveMode.ErrorIfExists).save(outPath)
  sc.stop()
}
{code}

This uses the name and favorite_number columns as partition fields, so the output directories are named after their values.

>  Use the column name as the file name.
> --
>
> Key: SPARK-21664
> URL: https://issues.apache.org/jira/browse/SPARK-21664
> Project: Spark
>  Issue Type: Question
>  Components: Input/Output
>Affects Versions: 2.2.0
>Reporter: jifei_yang
>Priority: Major
>
> When saving a DataFrame, we want to use the column name as the file name. 
> This can be done with PairRDDFunctions. Can the same be done with a DataFrame? Thank you.


