[jira] [Commented] (SPARK-21629) OR nullability is incorrect

2017-08-05 Thread Liang-Chi Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115626#comment-16115626
 ] 

Liang-Chi Hsieh commented on SPARK-21629:
-

[~hvanhovell] As this is not a problem, it should be closed, but I can't close
it myself. Can you help? Thanks.
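
For context: under SQL three-valued logic, TRUE OR NULL is TRUE but FALSE OR NULL is
NULL, so the result of OR can be NULL whenever either operand can be NULL. A minimal
spark-shell check of this (a sketch, not part of the ticket itself):

{code}
// Sketch only: FALSE OR NULL evaluates to NULL, so OR has to stay nullable
// as soon as either input is nullable.
spark.sql("SELECT false OR CAST(NULL AS BOOLEAN) AS r").show()
// expected: a single row with r = null
{code}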

> OR nullability is incorrect
> ---
>
> Key: SPARK-21629
> URL: https://issues.apache.org/jira/browse/SPARK-21629
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0, 2.1.1, 2.2.0
>Reporter: Herman van Hovell
>Assignee: Liang-Chi Hsieh
>Priority: Minor
>  Labels: Starter
>
> The SQL {{OR}} expression's nullability is slightly incorrect. It should only 
> be nullable when both of the input expressions are nullable, and not when 
> either of them is nullable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21588) SQLContext.getConf(key, null) should return null, but it throws NPE

2017-08-05 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115625#comment-16115625
 ] 

Vinod KC commented on SPARK-21588:
--

https://github.com/apache/spark/pull/18852

> SQLContext.getConf(key, null) should return null, but it throws NPE
> ---
>
> Key: SPARK-21588
> URL: https://issues.apache.org/jira/browse/SPARK-21588
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Burak Yavuz
>Priority: Minor
>
> SQLContext.getConf(key), for a key that is not defined in the conf and doesn't 
> have a default value defined, throws a NoSuchElementException. In order to 
> avoid that, I used null as the default value, which threw an NPE instead. If 
> the default is null, it shouldn't try to parse the default value in `getConfString`.
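
A minimal sketch of the reported behavior (the key name below is made up; assumes a
running SparkSession in spark-shell):

{code}
// Hypothetical key that is neither set nor registered with a default value.
val key = "spark.sql.some.undefined.key"

spark.sqlContext.getConf(key)        // throws NoSuchElementException
spark.sqlContext.getConf(key, null)  // reported to throw a NullPointerException;
                                     // the expectation is that it simply returns null
{code}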



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21610) Corrupt records are not handled properly when creating a dataframe from a file

2017-08-05 Thread Liang-Chi Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115622#comment-16115622
 ] 

Liang-Chi Hsieh commented on SPARK-21610:
-

I'm mentoring a beginner who is working on this.

> Corrupt records are not handled properly when creating a dataframe from a file
> --
>
> Key: SPARK-21610
> URL: https://issues.apache.org/jira/browse/SPARK-21610
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.2.0
> Environment: macOs Sierra 10.12.5
>Reporter: dmtran
>
> Consider a jsonl file with 3 records. The third record has a value of type 
> string, instead of int.
> {code}
> echo '{"field": 1}
> {"field": 2}
> {"field": "3"}' >/tmp/sample.json
> {code}
> Create a dataframe from this file, with a schema that contains 
> "_corrupt_record" so that corrupt records are kept.
> {code}
> import org.apache.spark.sql.types._
> val schema = new StructType()
>   .add("field", ByteType)
>   .add("_corrupt_record", StringType)
> val file = "/tmp/sample.json"
> val dfFromFile = spark.read.schema(schema).json(file)
> {code}
> Run the following lines from a spark-shell:
> {code}
> scala> dfFromFile.show(false)
> +-----+---------------+
> |field|_corrupt_record|
> +-----+---------------+
> |1    |null           |
> |2    |null           |
> |null |{"field": "3"} |
> +-----+---------------+
> scala> dfFromFile.filter($"_corrupt_record".isNotNull).count()
> res1: Long = 0
> scala> dfFromFile.filter($"_corrupt_record".isNull).count()
> res2: Long = 3
> {code}
> The expected result is 1 corrupt record and 2 valid records, but the actual 
> one is 0 corrupt record and 3 valid records.
> The bug is not reproduced if we create a dataframe from a RDD:
> {code}
> scala> val rdd = sc.textFile(file)
> rdd: org.apache.spark.rdd.RDD[String] = /tmp/sample.json MapPartitionsRDD[92] 
> at textFile at <console>:28
> scala> val dfFromRdd = spark.read.schema(schema).json(rdd)
> dfFromRdd: org.apache.spark.sql.DataFrame = [field: tinyint, _corrupt_record: 
> string]
> scala> dfFromRdd.show(false)
> +-----+---------------+
> |field|_corrupt_record|
> +-----+---------------+
> |1    |null           |
> |2    |null           |
> |null |{"field": "3"} |
> +-----+---------------+
> scala> dfFromRdd.filter($"_corrupt_record".isNotNull).count()
> res5: Long = 1
> scala> dfFromRdd.filter($"_corrupt_record".isNull).count()
> res6: Long = 2
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-20963) Support column aliases for aliased relation in FROM clause

2017-08-05 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-20963.
-
   Resolution: Fixed
 Assignee: Takeshi Yamamuro
Fix Version/s: 2.3.0

> Support column aliases for aliased relation in FROM clause
> --
>
> Key: SPARK-20963
> URL: https://issues.apache.org/jira/browse/SPARK-20963
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.1.1
>Reporter: Takeshi Yamamuro
>Assignee: Takeshi Yamamuro
> Fix For: 2.3.0
>
>
> Currently, we do not support column aliases for an aliased relation:
> {code}
> scala> Seq((1, 2), (2, 0)).toDF("id", "value").createOrReplaceTempView("t1")
> scala> Seq((1, 2), (2, 0)).toDF("id", "value").createOrReplaceTempView("t2")
> scala> sql("SELECT * FROM (t1 JOIN t2)")
> scala> sql("SELECT * FROM (t1 INNER JOIN t2 ON t1.id = t2.id) AS t(a, b, c, 
> d)").show
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input '(' expecting {<EOF>, ',', 'WHERE', 'GROUP', 'ORDER', 
> 'HAVING', 'LIMIT', 'JOIN', 'CROSS', 'INNER', 'LEFT', 'RIGHT', 'FULL', 
> 'NATURAL', 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', 
> 'SORT', 'CLUSTER', 'DISTRIBUTE', 'ANTI'}(line 1, pos 54)
> == SQL ==
> SELECT * FROM (t1 INNER JOIN t2 ON t1.id = t2.id) AS t(a, b, c, d)
> --^^^
>   at 
> org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:217)
>   at 
> org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:114)
>   at org.apache.spark.sql.execution.SparkSqlParser.parse(Spa
> {code}
> We could support this by referring to:
> http://docs.aws.amazon.com/redshift/latest/dg/r_FROM_clause30.html
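
For illustration, the kind of query that should work once column aliases are supported
for aliased relations (a sketch only; the exact grammar is up to the PR):

{code}
// Sketch: a column alias list attached to an aliased relation in the FROM clause.
Seq((1, 2), (2, 0)).toDF("id", "value").createOrReplaceTempView("t1")
Seq((1, 2), (2, 0)).toDF("id", "value").createOrReplaceTempView("t2")
sql("SELECT t.a, t.b, t.c, t.d FROM (t1 INNER JOIN t2 ON t1.id = t2.id) AS t(a, b, c, d)").show()
{code}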



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-21637) `hive.metastore.warehouse` in --hiveconf is not respected

2017-08-05 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-21637.
-
   Resolution: Fixed
 Assignee: Kent Yao
Fix Version/s: 2.3.0

> `hive.metastore.warehouse` in --hiveconf is not respected
> -
>
> Key: SPARK-21637
> URL: https://issues.apache.org/jira/browse/SPARK-21637
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Minor
> Fix For: 2.3.0
>
>
> In CliSuite, `hive.metastore.warehouse` is set at line 
> https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala#L92
> but has not been respected since this commit: 
> https://github.com/apache/spark/commit/8f33731e796750e6f60dc9e2fc33a94d29d198b4
> `bin/spark-sql --master local --hiveconf 
> hive.metastore.warehouse.dir=/some/dir` will not take effect, because we now 
> respect this property only through the Hadoop configuration, but 
> SparkSQLCliDriver does not add the --hiveconf entries to the Hadoop 
> configuration.
> {code:java}
> 17/08/04 15:46:53 INFO HiveClientImpl: Warehouse location for Hive client 
> (version 1.2.1) is 
> file:/home/hadoop/hzyaoqin/spark-2.2.0-bin-hadoop2.7/spark-warehouse
> spark-sql> set hive.metastore.warehouse.dir;
> 17/08/04 15:46:57 INFO SparkSqlParser: Parsing command: set 
> hive.metastore.warehouse.dir
> 17/08/04 15:47:00 INFO CodeGenerator: Code generated in 166.354926 ms
> hive.metastore.warehouse.dir  /some/dir
> Time taken: 2.154 seconds, Fetched 1 row(s)
> 17/08/04 15:47:00 INFO CliDriver: Time taken: 2.154 seconds, Fetched 1 row(s)
> spark-sql> set spark.sql.warehouse.dir;
> 17/08/04 15:47:13 INFO SparkSqlParser: Parsing command: set 
> spark.sql.warehouse.dir
> spark.sql.warehouse.dir   
> file:/home/hadoop/hzyaoqin/spark-2.2.0-bin-hadoop2.7/spark-warehouse
> Time taken: 0.024 seconds, Fetched 1 row(s)
> 17/08/04 15:47:13 INFO CliDriver: Time taken: 0.024 seconds, Fetched 1 row(s) 
> {code}
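
A rough sketch of the propagation described above (the variable names are assumptions,
not the actual SparkSQLCliDriver code): every --hiveconf key/value pair would need to be
copied into the Hadoop configuration that the Hive client reads.

{code}
import org.apache.spark.sql.SparkSession

// Sketch only: pretend these entries came from --hiveconf arguments on the CLI.
val hiveConfFromCli = Map("hive.metastore.warehouse.dir" -> "/some/dir")

val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
// Copy each --hiveconf entry into the Hadoop configuration so that the Hive
// client sees it when it resolves the warehouse location.
hiveConfFromCli.foreach { case (k, v) =>
  spark.sparkContext.hadoopConfiguration.set(k, v)
}
{code}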



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-21451) HiveConf in SparkSQLCLIDriver doesn't respect spark.hadoop.some.hive.variables

2017-08-05 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-21451.
-
   Resolution: Fixed
 Assignee: Kent Yao
Fix Version/s: 2.3.0

> HiveConf in SparkSQLCLIDriver doesn't respect spark.hadoop.some.hive.variables
> --
>
> Key: SPARK-21451
> URL: https://issues.apache.org/jira/browse/SPARK-21451
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.3, 2.0.2, 2.1.0, 2.2.0
>Reporter: Kent Yao
>Assignee: Kent Yao
> Fix For: 2.3.0
>
>
> https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala#L83
> does not copy properties configured for Hadoop/Hive via --conf or in 
> spark-defaults.conf using spark.hadoop.foo=bar.
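
A sketch of the expected behavior (the helper below is illustrative only, not the actual
Spark code): every spark.hadoop.foo=bar entry coming from --conf or spark-defaults.conf
should end up as foo=bar in the Hadoop/Hive configuration used by the CLI driver.

{code}
import org.apache.hadoop.conf.Configuration
import org.apache.spark.SparkConf

// Illustrative helper: copy spark.hadoop.* properties into a Hadoop Configuration.
def copySparkHadoopProps(sparkConf: SparkConf, hadoopConf: Configuration): Unit = {
  sparkConf.getAll.foreach {
    case (key, value) if key.startsWith("spark.hadoop.") =>
      hadoopConf.set(key.stripPrefix("spark.hadoop."), value)
    case _ => // not a spark.hadoop.* property, nothing to copy
  }
}
{code}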



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21637) `hive.metastore.warehouse` in --hiveconf is not respected

2017-08-05 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115558#comment-16115558
 ] 

Xiao Li commented on SPARK-21637:
-

https://github.com/apache/spark/pull/18668

> `hive.metastore.warehouse` in --hiveconf is not respected
> -
>
> Key: SPARK-21637
> URL: https://issues.apache.org/jira/browse/SPARK-21637
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Kent Yao
>Priority: Minor
>
> In CliSuite, `hive.metastore.warehouse` is set at line 
> https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala#L92
> but has not been respected since this commit: 
> https://github.com/apache/spark/commit/8f33731e796750e6f60dc9e2fc33a94d29d198b4
> `bin/spark-sql --master local --hiveconf 
> hive.metastore.warehouse.dir=/some/dir` will not take effect, because we now 
> respect this property only through the Hadoop configuration, but 
> SparkSQLCliDriver does not add the --hiveconf entries to the Hadoop 
> configuration.
> {code:java}
> 17/08/04 15:46:53 INFO HiveClientImpl: Warehouse location for Hive client 
> (version 1.2.1) is 
> file:/home/hadoop/hzyaoqin/spark-2.2.0-bin-hadoop2.7/spark-warehouse
> spark-sql> set hive.metastore.warehouse.dir;
> 17/08/04 15:46:57 INFO SparkSqlParser: Parsing command: set 
> hive.metastore.warehouse.dir
> 17/08/04 15:47:00 INFO CodeGenerator: Code generated in 166.354926 ms
> hive.metastore.warehouse.dir  /some/dir
> Time taken: 2.154 seconds, Fetched 1 row(s)
> 17/08/04 15:47:00 INFO CliDriver: Time taken: 2.154 seconds, Fetched 1 row(s)
> spark-sql> set spark.sql.warehouse.dir;
> 17/08/04 15:47:13 INFO SparkSqlParser: Parsing command: set 
> spark.sql.warehouse.dir
> spark.sql.warehouse.dir   
> file:/home/hadoop/hzyaoqin/spark-2.2.0-bin-hadoop2.7/spark-warehouse
> Time taken: 0.024 seconds, Fetched 1 row(s)
> 17/08/04 15:47:13 INFO CliDriver: Time taken: 0.024 seconds, Fetched 1 row(s) 
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-21640) Method mode with String parameters within DataFrameWriter is error prone

2017-08-05 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-21640.
-
   Resolution: Fixed
Fix Version/s: 2.3.0

> Method mode with String parameters within DataFrameWriter is error prone
> 
>
> Key: SPARK-21640
> URL: https://issues.apache.org/jira/browse/SPARK-21640
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Alberto
>Assignee: Alberto
>Priority: Trivial
> Fix For: 2.3.0
>
>
> The following method:
> {code:java}
> def mode(saveMode: String): DataFrameWriter[T]
> {code}
> sets the SaveMode of the DataFrameWriter depending on the string that is 
> passed in as a parameter.
> There is a java Enum with all the save modes which are Append, Overwrite, 
> ErrorIfExists and Ignore. In my current project I was writing some code that 
> was using this enum to get the string value that I use to call the mode 
> method:
> {code:java}
>   private[utils] val configModeAppend = SaveMode.Append.toString.toLowerCase
>   private[utils] val configModeErrorIfExists = 
> SaveMode.ErrorIfExists.toString.toLowerCase
>   private[utils] val configModeIgnore = SaveMode.Ignore.toString.toLowerCase
>   private[utils] val configModeOverwrite = 
> SaveMode.Overwrite.toString.toLowerCase
> {code}
> The configModeErrorIfExists val contains the value "errorifexists", and when I 
> call the mode method using this string it does not match. I suggest including 
> "errorifexists" as a valid match for the ErrorIfExists SaveMode.
> Will create a PR to address this issue ASAP.
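
For illustration, a sketch of the string-to-SaveMode mapping being discussed (not the
actual DataFrameWriter implementation); the point is that "errorifexists" should also
resolve to SaveMode.ErrorIfExists:

{code:java}
import org.apache.spark.sql.SaveMode

// Illustrative mapping only: case-insensitive, accepting "errorifexists" as well.
def toSaveMode(saveMode: String): SaveMode = saveMode.toLowerCase match {
  case "overwrite"               => SaveMode.Overwrite
  case "append"                  => SaveMode.Append
  case "ignore"                  => SaveMode.Ignore
  case "error" | "errorifexists" => SaveMode.ErrorIfExists
  case other => throw new IllegalArgumentException(s"Unknown save mode: $other")
}
{code}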



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-21640) Method mode with String parameters within DataFrameWriter is error prone

2017-08-05 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reassigned SPARK-21640:
---

Assignee: Alberto

> Method mode with String parameters within DataFrameWriter is error prone
> 
>
> Key: SPARK-21640
> URL: https://issues.apache.org/jira/browse/SPARK-21640
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Alberto
>Assignee: Alberto
>Priority: Trivial
> Fix For: 2.3.0
>
>
> The following method:
> {code:java}
> def mode(saveMode: String): DataFrameWriter[T]
> {code}
> sets the SaveMode of the DataFrameWriter depending on the string that is 
> passed in as a parameter.
> There is a java Enum with all the save modes which are Append, Overwrite, 
> ErrorIfExists and Ignore. In my current project I was writing some code that 
> was using this enum to get the string value that I use to call the mode 
> method:
> {code:java}
>   private[utils] val configModeAppend = SaveMode.Append.toString.toLowerCase
>   private[utils] val configModeErrorIfExists = 
> SaveMode.ErrorIfExists.toString.toLowerCase
>   private[utils] val configModeIgnore = SaveMode.Ignore.toString.toLowerCase
>   private[utils] val configModeOverwrite = 
> SaveMode.Overwrite.toString.toLowerCase
> {code}
> The configModeErrorIfExists val contains the value "errorifexists", and when I 
> call the mode method using this string it does not match. I suggest including 
> "errorifexists" as a valid match for the ErrorIfExists SaveMode.
> Will create a PR to address this issue ASAP.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-21616) SparkR 2.3.0 migration guide, release note

2017-08-05 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-21616:
--
Fix Version/s: (was: 2.3.0)

> SparkR 2.3.0 migration guide, release note
> --
>
> Key: SPARK-21616
> URL: https://issues.apache.org/jira/browse/SPARK-21616
> Project: Spark
>  Issue Type: Documentation
>  Components: Documentation, SparkR
>Affects Versions: 2.3.0
>Reporter: Felix Cheung
>Assignee: Felix Cheung
>
> From looking at the changes since 2.2.0, these should be documented in the 
> migration guide / release notes for the 2.3.0 release, as they are behavior 
> changes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21631) Building Spark with SBT unsuccessful when source code in Mllib is modified, But with MVN is ok

2017-08-05 Thread Sean Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115448#comment-16115448
 ] 

Sean Wong commented on SPARK-21631:
---

So, how do I set NOLINT_ON_COMPILE, and in which file? Thanks.

> Building Spark with SBT unsuccessful when source code in Mllib is modified, 
> But with MVN is ok
> --
>
> Key: SPARK-21631
> URL: https://issues.apache.org/jira/browse/SPARK-21631
> Project: Spark
>  Issue Type: Bug
>  Components: Build, MLlib
>Affects Versions: 2.1.1
> Environment: ubuntu 14.04
> Spark 2.1.1
> MVN 3.3.9
> scala 2.11.8
>Reporter: Sean Wong
>
> I added 
> import org.apache.spark.internal.Logging
> at the head of the LinearRegression.scala file.
> Then I tried to build Spark using SBT.
> However, here is the error:
> *[info] Done packaging.
> java.lang.RuntimeException: errors exist
> at scala.sys.package$.error(package.scala:27)
> at org.scalastyle.sbt.Tasks$.onHasErrors$1(Plugin.scala:132)
> at 
> org.scalastyle.sbt.Tasks$.doScalastyleWithConfig$1(Plugin.scala:187)
> at org.scalastyle.sbt.Tasks$.doScalastyle(Plugin.scala:195)
> at 
> SparkBuild$$anonfun$cachedScalaStyle$1$$anonfun$17.apply(SparkBuild.scala:205)
> at 
> SparkBuild$$anonfun$cachedScalaStyle$1$$anonfun$17.apply(SparkBuild.scala:192)
> at sbt.FileFunction$$anonfun$cached$1.apply(Tracked.scala:235)
> at sbt.FileFunction$$anonfun$cached$1.apply(Tracked.scala:235)
> at 
> sbt.FileFunction$$anonfun$cached$2$$anonfun$apply$3$$anonfun$apply$4.apply(Tracked.scala:249)
> at 
> sbt.FileFunction$$anonfun$cached$2$$anonfun$apply$3$$anonfun$apply$4.apply(Tracked.scala:245)
> at sbt.Difference.apply(Tracked.scala:224)
> at sbt.Difference.apply(Tracked.scala:206)
> at 
> sbt.FileFunction$$anonfun$cached$2$$anonfun$apply$3.apply(Tracked.scala:245)
> at 
> sbt.FileFunction$$anonfun$cached$2$$anonfun$apply$3.apply(Tracked.scala:244)
> at sbt.Difference.apply(Tracked.scala:224)
> at sbt.Difference.apply(Tracked.scala:200)
> at sbt.FileFunction$$anonfun$cached$2.apply(Tracked.scala:244)
> at sbt.FileFunction$$anonfun$cached$2.apply(Tracked.scala:242)
> at SparkBuild$$anonfun$cachedScalaStyle$1.apply(SparkBuild.scala:212)
> at SparkBuild$$anonfun$cachedScalaStyle$1.apply(SparkBuild.scala:187)
> at scala.Function1$$anonfun$compose$1.apply(Function1.scala:47)
> at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:40)
> at sbt.std.Transform$$anon$4.work(System.scala:63)
> at 
> sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
> at 
> sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
> at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:17)
> at sbt.Execute.work(Execute.scala:237)
> at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
> at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
> at 
> sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:159)
> at sbt.CompletionService$$anon$2.call(CompletionService.scala:28)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> [error] (mllib/*:scalaStyleOnCompile) errors exist*
> After this, I switched to building Spark with MVN; everything is OK and the 
> build succeeds.
> So is this a bug in the SBT build? 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21629) OR nullability is incorrect

2017-08-05 Thread Liang-Chi Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115405#comment-16115405
 ] 

Liang-Chi Hsieh commented on SPARK-21629:
-

Yep, thanks a lot [~hvanhovell].

> OR nullability is incorrect
> ---
>
> Key: SPARK-21629
> URL: https://issues.apache.org/jira/browse/SPARK-21629
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0, 2.1.1, 2.2.0
>Reporter: Herman van Hovell
>Assignee: Liang-Chi Hsieh
>Priority: Minor
>  Labels: Starter
>
> The SQL {{OR}} expression's nullability is slightly incorrect. It should only 
> be nullable when both of the input expressions are nullable, and not when 
> either of them is nullable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-21629) OR nullability is incorrect

2017-08-05 Thread Liang-Chi Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang-Chi Hsieh updated SPARK-21629:

Labels: Starter  (was: )

> OR nullability is incorrect
> ---
>
> Key: SPARK-21629
> URL: https://issues.apache.org/jira/browse/SPARK-21629
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0, 2.1.1, 2.2.0
>Reporter: Herman van Hovell
>Priority: Minor
>  Labels: Starter
>
> The SQL {{OR}} expression's nullability is slightly incorrect. It should only 
> be nullable when both of the input expressions are nullable, and not when 
> either of them is nullable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21629) OR nullability is incorrect

2017-08-05 Thread Herman van Hovell (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115397#comment-16115397
 ] 

Herman van Hovell commented on SPARK-21629:
---

Sure, go for it. Should be a one-liner :)

> OR nullability is incorrect
> ---
>
> Key: SPARK-21629
> URL: https://issues.apache.org/jira/browse/SPARK-21629
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0, 2.1.1, 2.2.0
>Reporter: Herman van Hovell
>Priority: Minor
>  Labels: Starter
>
> The SQL {{OR}} expression's nullability is slightly incorrect. It should only 
> be nullable when both of the input expressions are nullable, and not when 
> either of them is nullable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-21629) OR nullability is incorrect

2017-08-05 Thread Herman van Hovell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Herman van Hovell reassigned SPARK-21629:
-

Assignee: Liang-Chi Hsieh

> OR nullability is incorrect
> ---
>
> Key: SPARK-21629
> URL: https://issues.apache.org/jira/browse/SPARK-21629
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0, 2.1.1, 2.2.0
>Reporter: Herman van Hovell
>Assignee: Liang-Chi Hsieh
>Priority: Minor
>  Labels: Starter
>
> The SQL {{OR}} expression's nullability is slightly incorrect. It should only 
> be nullable when both of the input expressions are nullable, and not when 
> either of them is nullable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21629) OR nullability is incorrect

2017-08-05 Thread Liang-Chi Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115395#comment-16115395
 ] 

Liang-Chi Hsieh commented on SPARK-21629:
-

[~hvanhovell] Sorry, I'm mentoring a few local developers in Taiwan on 
contributing to Spark. This looks like a trivial PR. May I ask to hold this 
issue for one of the mentees? If you don't mind, can you assign it to me 
temporarily? Thanks!

> OR nullability is incorrect
> ---
>
> Key: SPARK-21629
> URL: https://issues.apache.org/jira/browse/SPARK-21629
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0, 2.1.1, 2.2.0
>Reporter: Herman van Hovell
>Priority: Minor
>  Labels: Starter
>
> The SQL {{OR}} expression's nullability is slightly incorrect. It should only 
> be nullable when both of the input expressions are nullable, and not when 
> either of them is nullable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-21629) OR nullability is incorrect

2017-08-05 Thread Liang-Chi Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115395#comment-16115395
 ] 

Liang-Chi Hsieh edited comment on SPARK-21629 at 8/5/17 1:12 PM:
-

[~hvanhovell] Sorry, I'm mentoring a few local developers in Taiwan on 
contributing to Spark. This looks like a trivial PR. May I ask to hold this 
issue for one of the mentees? If you don't mind, can you assign it to me 
temporarily? Thanks!


was (Author: viirya):
[~hvanhovell] Sorry, I'm mentoring the few local developers in Taiwan for 
contributing Spark. This looks a trivial PR. May I ask holding this issue for 
one of the mentees? If you don't mind, can you help assign it to me 
temporarily. Thanks!

> OR nullability is incorrect
> ---
>
> Key: SPARK-21629
> URL: https://issues.apache.org/jira/browse/SPARK-21629
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0, 2.1.1, 2.2.0
>Reporter: Herman van Hovell
>Priority: Minor
>  Labels: Starter
>
> The SQL {{OR}} expression's nullability is slightly incorrect. It should only 
> be nullable when both of the input expressions are nullable, and not when 
> either of them is nullable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21645) SparkSQL Left outer join get the error result when use phoenix spark plugin

2017-08-05 Thread shining (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115385#comment-16115385
 ] 

shining commented on SPARK-21645:
-

[~q79969786]
The physical plan from EXPLAIN:
== Physical Plan ==
{quote}*Project [anchedate#402, womempnumdis#429, holdingsmsgdis#430]
+- SortMergeJoin [ANCHEID#400, S_EXT_NODENUM#426], [ANCHEID#428, 
S_EXT_NODENUM#431], LeftOuter
   :- *Sort [ANCHEID#400 ASC NULLS FIRST, S_EXT_NODENUM#426 ASC NULLS FIRST], 
false, 0
   :  +- Exchange hashpartitioning(ANCHEID#400, S_EXT_NODENUM#426, 200)
   : +- *Filter (isnotnull(ANCHEID#400) && (ANCHEID#400 = 
2c9e87ea5bd35458015c2df4003a1025))
   :+- *Scan PhoenixRelation(AN_BASEINFO,node1:2181,false) 
default.an_baseinfo[ANCHEID#400,ANCHEDATE#402,S_EXT_NODENUM#426] PushedFilters: 
[IsNotNull(ANCHEID), EqualTo(ANCHEID,2c9e87ea5bd35458015c2df4003a1025)], 
ReadSchema: struct
   +- *Sort [ANCHEID#428 ASC NULLS FIRST, S_EXT_NODENUM#431 ASC NULLS FIRST], 
false, 0
  +- Exchange hashpartitioning(ANCHEID#428, S_EXT_NODENUM#431, 200)
 +- *Scan PhoenixRelation(AN_SUP_BASEINFO,node1:2181,false) 
default.an_sup_baseinfo[ANCHEID#428,WOMEMPNUMDIS#429,HOLDINGSMSGDIS#430,S_EXT_NODENUM#431]
 ReadSchema: 
struct{quote}

> SparkSQL Left outer join get the error result when use phoenix spark plugin
> ---
>
> Key: SPARK-21645
> URL: https://issues.apache.org/jira/browse/SPARK-21645
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0, 2.2.0
> Environment: spark2.1.0
> hbase 1.1.2
> phoenix4.10
>Reporter: shining
>
> I have two tables in phoenix: AN_BASEINFO and AN_SUP_BASEINFO.
> Then I create the external datasource tables in SparkSQL through the phoenix 
> spark plugin, like:
> create table AN_BASEINFO 
> using org.apache.phoenix.spark
> OPTIONS(table "AN_BASEINFO ", zkUrl "172.16.12.82:2181")
> and 
> create table AN_SUP_BASEINFO 
> using org.apache.phoenix.spark
> OPTIONS(table "AN_SUP_BASEINFO ", zkUrl "172.16.12.82:2181")
> In SparkSQL I execute a query that uses a left outer join; the SQL is:
> {color:red}{color:#f79232}_
> *select
> a.anchedate,b.womempnumdis,b.holdingsmsgdis
> from
> AN_BASEINFO a
>  left outer join AN_SUP_BASEINFO b
> on
>a.S_EXT_NODENUM = b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID
> where
> a.ANCHEID= '2c9e87ea5bd35458015c2df4003a1025';*_{color}{color}
> The result is: 2017-05-22 00:00:00.0   NULL    NULL
> But actually, table AN_SUP_BASEINFO contains a record where a.S_EXT_NODENUM = 
> b.S_EXT_NODENUM and a.ANCHEID = b.ANCHEID.
> If I add the filter condition b.holdingsmsgdis is not null to the SQL, the 
> result is right:
> 2017-05-22 00:00:00.0   2   1 
> the sql:
> *{color:#d04437}select
> a.anchedate,b.womempnumdis,b.holdingsmsgdis
> from
> AN_BASEINFO a
>  left outer join AN_SUP_BASEINFO b
> on
>a.S_EXT_NODENUM = b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID
> where
> a.ANCHEID= '2c9e87ea5bd35458015c2df4003a1025'{color:#d04437}and 
> b.holdingsmsgdis is not null;{color}{color}*
> {color:#d04437}{color:#14892c}result is right: 2017-05-22 00:00:00.0   2  
>  1 {color}{color}
> Does anyone know about this? Please help!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21645) SparkSQL Left outer join get the error result when use phoenix spark plugin

2017-08-05 Thread Yuming Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115365#comment-16115365
 ] 

Yuming Wang commented on SPARK-21645:
-

Can you paste the SQL execution plan:
{code:sql}
EXPLAIN SELECT a.anchedate, b.womempnumdis, b.holdingsmsgdis
FROM AN_BASEINFO a
  LEFT JOIN AN_SUP_BASEINFO b ON a.S_EXT_NODENUM = b.S_EXT_NODENUM
AND a.ANCHEID = b.ANCHEID
WHERE a.ANCHEID = '2c9e87ea5bd35458015c2df4003a1025';
{code}

> SparkSQL Left outer join get the error result when use phoenix spark plugin
> ---
>
> Key: SPARK-21645
> URL: https://issues.apache.org/jira/browse/SPARK-21645
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0, 2.2.0
> Environment: spark2.1.0
> hbase 1.1.2
> phoenix4.10
>Reporter: shining
>
> I have two tables in phoenix: AN_BASEINFO and AN_SUP_BASEINFO.
> Then I create the external datasource tables in SparkSQL through the phoenix 
> spark plugin, like:
> create table AN_BASEINFO 
> using org.apache.phoenix.spark
> OPTIONS(table "AN_BASEINFO ", zkUrl "172.16.12.82:2181")
> and 
> create table AN_SUP_BASEINFO 
> using org.apache.phoenix.spark
> OPTIONS(table "AN_SUP_BASEINFO ", zkUrl "172.16.12.82:2181")
> In SparkSQL I execute a query that uses a left outer join; the SQL is:
> {color:red}{color:#f79232}_
> *select
> a.anchedate,b.womempnumdis,b.holdingsmsgdis
> from
> AN_BASEINFO a
>  left outer join AN_SUP_BASEINFO b
> on
>a.S_EXT_NODENUM = b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID
> where
> a.ANCHEID= '2c9e87ea5bd35458015c2df4003a1025';*_{color}{color}
> The result is: 2017-05-22 00:00:00.0   NULL    NULL
> But actually, table AN_SUP_BASEINFO contains a record where a.S_EXT_NODENUM = 
> b.S_EXT_NODENUM and a.ANCHEID = b.ANCHEID.
> If I add the filter condition b.holdingsmsgdis is not null to the SQL, the 
> result is right:
> 2017-05-22 00:00:00.0   2   1 
> the sql:
> *{color:#d04437}select
> a.anchedate,b.womempnumdis,b.holdingsmsgdis
> from
> AN_BASEINFO a
>  left outer join AN_SUP_BASEINFO b
> on
>a.S_EXT_NODENUM = b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID
> where
> a.ANCHEID= '2c9e87ea5bd35458015c2df4003a1025'{color:#d04437}and 
> b.holdingsmsgdis is not null;{color}{color}*
> {color:#d04437}{color:#14892c}result is right: 2017-05-22 00:00:00.0   2  
>  1 {color}{color}
> Does anyone know about this? Please help!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21646) BinaryComparison shouldn't auto cast string to int/long

2017-08-05 Thread Yuming Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115350#comment-16115350
 ] 

Yuming Wang commented on SPARK-21646:
-

User 'wangyum' has created a pull request for this issue:
https://github.com/apache/spark/pull/18853

> BinaryComparison shouldn't auto cast string to int/long
> ---
>
> Key: SPARK-21646
> URL: https://issues.apache.org/jira/browse/SPARK-21646
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Yuming Wang
>
> How to reproduce:
> hive:
> {code:sql}
> $ hive -S
> hive> create table spark_21646(c1 string, c2 string);
> hive> insert into spark_21646 values('92233720368547758071', 'a');
> hive> insert into spark_21646 values('21474836471', 'b');
> hive> insert into spark_21646 values('10', 'c');
> hive> select * from spark_21646 where c1 > 0;
> 92233720368547758071  a
> 10c
> 21474836471   b
> hive>
> {code}
> spark-sql:
> {code:sql}
> $ spark-sql -S
> spark-sql> select * from spark_21646 where c1 > 0;
> 10  c 
>   
> spark-sql> select * from spark_21646 where c1 > 0L;
> 21474836471   b
> 10c
> spark-sql> explain select * from spark_21646 where c1 > 0;
> == Physical Plan ==
> *Project [c1#14, c2#15]
> +- *Filter (isnotnull(c1#14) && (cast(c1#14 as int) > 0))
>+- *FileScan parquet spark_21646[c1#14,c2#15] Batched: true, Format: 
> Parquet, Location: 
> InMemoryFileIndex[viewfs://cluster4/user/hive/warehouse/spark_21646], 
> PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: 
> struct
> spark-sql> 
> {code}
> As you can see, Spark automatically casts c1 to int; if the value is out of the 
> integer range, the result is different from Hive.
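
As a side note, a workaround sketch (not the fix proposed in this ticket): casting the
string column to a wide decimal avoids the lossy implicit cast to int, so values beyond
the int range are kept.

{code}
// Workaround sketch only; run from spark-shell with Hive support enabled.
spark.sql("SELECT * FROM spark_21646 WHERE CAST(c1 AS DECIMAL(38, 0)) > 0").show()
{code}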



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Issue Comment Deleted] (SPARK-21646) BinaryComparison shouldn't auto cast string to int/long

2017-08-05 Thread Yuming Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang updated SPARK-21646:

Comment: was deleted

(was: I'll create a PR later)

> BinaryComparison shouldn't auto cast string to int/long
> ---
>
> Key: SPARK-21646
> URL: https://issues.apache.org/jira/browse/SPARK-21646
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Yuming Wang
>
> How to reproduce:
> hive:
> {code:sql}
> $ hive -S
> hive> create table spark_21646(c1 string, c2 string);
> hive> insert into spark_21646 values('92233720368547758071', 'a');
> hive> insert into spark_21646 values('21474836471', 'b');
> hive> insert into spark_21646 values('10', 'c');
> hive> select * from spark_21646 where c1 > 0;
> 92233720368547758071  a
> 10c
> 21474836471   b
> hive>
> {code}
> spark-sql:
> {code:sql}
> $ spark-sql -S
> spark-sql> select * from spark_21646 where c1 > 0;
> 10  c 
>   
> spark-sql> select * from spark_21646 where c1 > 0L;
> 21474836471   b
> 10c
> spark-sql> explain select * from spark_21646 where c1 > 0;
> == Physical Plan ==
> *Project [c1#14, c2#15]
> +- *Filter (isnotnull(c1#14) && (cast(c1#14 as int) > 0))
>+- *FileScan parquet spark_21646[c1#14,c2#15] Batched: true, Format: 
> Parquet, Location: 
> InMemoryFileIndex[viewfs://cluster4/user/hive/warehouse/spark_21646], 
> PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: 
> struct
> spark-sql> 
> {code}
> As you can see, Spark automatically casts c1 to int; if the value is out of the 
> integer range, the result is different from Hive.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-21646) BinaryComparison shouldn't auto cast string to int/long

2017-08-05 Thread Yuming Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang updated SPARK-21646:

Description: 
How to reproduce:
hive:
{code:sql}
$ hive -S
hive> create table spark_21646(c1 string, c2 string);
hive> insert into spark_21646 values('92233720368547758071', 'a');
hive> insert into spark_21646 values('21474836471', 'b');
hive> insert into spark_21646 values('10', 'c');
hive> select * from spark_21646 where c1 > 0;
92233720368547758071a
10  c
21474836471 b
hive>
{code}

spark-sql:
{code:sql}
$ spark-sql -S
spark-sql> select * from spark_21646 where c1 > 0;
10  c   
spark-sql> select * from spark_21646 where c1 > 0L;
21474836471 b
10  c
spark-sql> explain select * from spark_21646 where c1 > 0;
== Physical Plan ==
*Project [c1#14, c2#15]
+- *Filter (isnotnull(c1#14) && (cast(c1#14 as int) > 0))
   +- *FileScan parquet spark_21646[c1#14,c2#15] Batched: true, Format: 
Parquet, Location: 
InMemoryFileIndex[viewfs://cluster4/user/hive/warehouse/spark_21646], 
PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: 
struct
spark-sql> 
{code}

As you can see, Spark automatically casts c1 to int; if the value is out of the 
integer range, the result is different from Hive.

  was:
How to reproduce:
Hive:
{code:sql}
$ hive -S
hive> create table spark_21646(c1 string, c2 string);
hive> insert into spark_21646 values('92233720368547758071', 'a');
hive> insert into spark_21646 values('21474836471', 'b');
hive> insert into spark_21646 values('10', 'c');
hive> select * from spark_21646 where c1 > 0;
92233720368547758071a
10  c
21474836471 b
hive>
{code}

{code:sql}
$ spark-sql -S
spark-sql> select * from spark_21646 where c1 > 0;
10  c   
spark-sql> select * from spark_21646 where c1 > 0L;
21474836471 b
10  c
spark-sql> explain select * from spark_21646 where c1 > 0;
== Physical Plan ==
*Project [c1#14, c2#15]
+- *Filter (isnotnull(c1#14) && (cast(c1#14 as int) > 0))
   +- *FileScan parquet spark_21646[c1#14,c2#15] Batched: true, Format: 
Parquet, Location: 
InMemoryFileIndex[viewfs://cluster4/user/hive/warehouse/spark_21646], 
PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: 
struct
spark-sql> 
{code}

As you can see, spark auto cast c1 to int type, if this value out of integer 
range, the result is different from Hive.


> BinaryComparison shouldn't auto cast string to int/long
> ---
>
> Key: SPARK-21646
> URL: https://issues.apache.org/jira/browse/SPARK-21646
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Yuming Wang
>
> How to reproduce:
> hive:
> {code:sql}
> $ hive -S
> hive> create table spark_21646(c1 string, c2 string);
> hive> insert into spark_21646 values('92233720368547758071', 'a');
> hive> insert into spark_21646 values('21474836471', 'b');
> hive> insert into spark_21646 values('10', 'c');
> hive> select * from spark_21646 where c1 > 0;
> 92233720368547758071  a
> 10c
> 21474836471   b
> hive>
> {code}
> spark-sql:
> {code:sql}
> $ spark-sql -S
> spark-sql> select * from spark_21646 where c1 > 0;
> 10  c 
>   
> spark-sql> select * from spark_21646 where c1 > 0L;
> 21474836471   b
> 10c
> spark-sql> explain select * from spark_21646 where c1 > 0;
> == Physical Plan ==
> *Project [c1#14, c2#15]
> +- *Filter (isnotnull(c1#14) && (cast(c1#14 as int) > 0))
>+- *FileScan parquet spark_21646[c1#14,c2#15] Batched: true, Format: 
> Parquet, Location: 
> InMemoryFileIndex[viewfs://cluster4/user/hive/warehouse/spark_21646], 
> PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: 
> struct
> spark-sql> 
> {code}
> As you can see, Spark automatically casts c1 to int; if the value is out of the 
> integer range, the result is different from Hive.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-21646) BinaryComparison shouldn't auto cast string to int/long

2017-08-05 Thread Yuming Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang updated SPARK-21646:

Description: 
How to reproduce:
Hive:
{code:sql}
$ hive -S
hive> create table spark_21646(c1 string, c2 string);
hive> insert into spark_21646 values('92233720368547758071', 'a');
hive> insert into spark_21646 values('21474836471', 'b');
hive> insert into spark_21646 values('10', 'c');
hive> select * from spark_21646 where c1 > 0;
92233720368547758071a
10  c
21474836471 b
hive>
{code}

{code:sql}
$ spark-sql -S
spark-sql> select * from spark_21646 where c1 > 0;
10  c   
spark-sql> select * from spark_21646 where c1 > 0L;
21474836471 b
10  c
spark-sql> explain select * from spark_21646 where c1 > 0;
== Physical Plan ==
*Project [c1#14, c2#15]
+- *Filter (isnotnull(c1#14) && (cast(c1#14 as int) > 0))
   +- *FileScan parquet spark_21646[c1#14,c2#15] Batched: true, Format: 
Parquet, Location: 
InMemoryFileIndex[viewfs://cluster4/user/hive/warehouse/spark_21646], 
PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: 
struct
spark-sql> 
{code}

As you can see, Spark automatically casts c1 to int; if the value is out of the 
integer range, the result is different from Hive.

  was:
Hive:
{code:sql}
$ hive -S
hive> create table tmp.wym_spark_123(c1 string, c2 string);
hive> insert into tmp.wym_spark_123 values('92233720368547758071', 'a');
hive> insert into tmp.wym_spark_123 values('21474836471', 'b');
hive> insert into tmp.wym_spark_123 values('10', 'c');
hive> select * from tmp.wym_spark_123 where c1 > 0;
92233720368547758071a
10  c
21474836471 b
hive>
{code}

{code:sql}
$ spark-sql -S
spark-sql> select * from tmp.wym_spark_123 where c1 > 0;
10  c   
spark-sql> select * from tmp.wym_spark_123 where c1 > 0L;
21474836471 b
10  c
spark-sql> explain select * from tmp.wym_spark_123 where c1 > 0;
== Physical Plan ==
*Project [c1#14, c2#15]
+- *Filter (isnotnull(c1#14) && (cast(c1#14 as int) > 0))
   +- *FileScan parquet tmp.wym_spark_123[c1#14,c2#15] Batched: true, Format: 
Parquet, Location: 
InMemoryFileIndex[viewfs://cluster4/user/hive/warehouse/tmp.db/wym_spark_123], 
PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: 
struct
spark-sql> 
{code}


> BinaryComparison shouldn't auto cast string to int/long
> ---
>
> Key: SPARK-21646
> URL: https://issues.apache.org/jira/browse/SPARK-21646
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Yuming Wang
>
> How to reproduce:
> Hive:
> {code:sql}
> $ hive -S
> hive> create table spark_21646(c1 string, c2 string);
> hive> insert into spark_21646 values('92233720368547758071', 'a');
> hive> insert into spark_21646 values('21474836471', 'b');
> hive> insert into spark_21646 values('10', 'c');
> hive> select * from spark_21646 where c1 > 0;
> 92233720368547758071  a
> 10c
> 21474836471   b
> hive>
> {code}
> {code:sql}
> $ spark-sql -S
> spark-sql> select * from spark_21646 where c1 > 0;
> 10  c 
>   
> spark-sql> select * from spark_21646 where c1 > 0L;
> 21474836471   b
> 10c
> spark-sql> explain select * from spark_21646 where c1 > 0;
> == Physical Plan ==
> *Project [c1#14, c2#15]
> +- *Filter (isnotnull(c1#14) && (cast(c1#14 as int) > 0))
>+- *FileScan parquet spark_21646[c1#14,c2#15] Batched: true, Format: 
> Parquet, Location: 
> InMemoryFileIndex[viewfs://cluster4/user/hive/warehouse/spark_21646], 
> PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: 
> struct
> spark-sql> 
> {code}
> As you can see, Spark automatically casts c1 to int; if the value is out of the 
> integer range, the result is different from Hive.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-21609) In the Master ui add "log directory" display, is conducive to users to quickly find the log directory path.

2017-08-05 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-21609.
---
Resolution: Not A Problem

> In the Master ui add "log directory" display, is conducive to users to 
> quickly find the log directory path.
> ---
>
> Key: SPARK-21609
> URL: https://issues.apache.org/jira/browse/SPARK-21609
> Project: Spark
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: guoxiaolongzte
>Priority: Minor
>
> Add a "log directory" display to the Master UI so that users can quickly find 
> the log directory path.
> During Spark application development we look not only at the executor and 
> driver logs but also at the master and worker logs. However, the current UI 
> does not show the master and worker log paths, so users cannot easily find 
> them. So, I add a "log directory" display.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21646) BinaryComparison shouldn't auto cast string to int/long

2017-08-05 Thread Yuming Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115345#comment-16115345
 ] 

Yuming Wang commented on SPARK-21646:
-

I'll create a PR later

> BinaryComparison shouldn't auto cast string to int/long
> ---
>
> Key: SPARK-21646
> URL: https://issues.apache.org/jira/browse/SPARK-21646
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Yuming Wang
>
> Hive:
> {code:sql}
> $ hive -S
> hive> create table tmp.wym_spark_123(c1 string, c2 string);
> hive> insert into tmp.wym_spark_123 values('92233720368547758071', 'a');
> hive> insert into tmp.wym_spark_123 values('21474836471', 'b');
> hive> insert into tmp.wym_spark_123 values('10', 'c');
> hive> select * from tmp.wym_spark_123 where c1 > 0;
> 92233720368547758071  a
> 10c
> 21474836471   b
> hive>
> {code}
> {code:sql}
> $ spark-sql -S
> spark-sql> select * from tmp.wym_spark_123 where c1 > 0;
> 10  c 
>   
> spark-sql> select * from tmp.wym_spark_123 where c1 > 0L;
> 21474836471   b
> 10c
> spark-sql> explain select * from tmp.wym_spark_123 where c1 > 0;
> == Physical Plan ==
> *Project [c1#14, c2#15]
> +- *Filter (isnotnull(c1#14) && (cast(c1#14 as int) > 0))
>+- *FileScan parquet tmp.wym_spark_123[c1#14,c2#15] Batched: true, Format: 
> Parquet, Location: 
> InMemoryFileIndex[viewfs://cluster4/user/hive/warehouse/tmp.db/wym_spark_123],
>  PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: 
> struct
> spark-sql> 
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-21643) LR dataset worked in Spark 1.6.3, 2.0.2 stopped working in 2.1.0 onward

2017-08-05 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen closed SPARK-21643.
-

> LR dataset worked in Spark 1.6.3, 2.0.2 stopped working in 2.1.0 onward
> ---
>
> Key: SPARK-21643
> URL: https://issues.apache.org/jira/browse/SPARK-21643
> Project: Spark
>  Issue Type: Bug
>  Components: ML
>Affects Versions: 2.1.0, 2.1.1, 2.2.0
> Environment: CentOS 7, 256G memory, and 52 CPUs VM
>Reporter: Thomas Kwan
>
> This dataset works on 1.6.x and 2.0.x, but it is not converging with 2.1+.
> a) Download the data set 
> (https://s3.amazonaws.com/manage-partners/pipeline/di873-train.json.gz) and 
> uncompress it; I placed it at /tmp/di873-train.json
> b) Download the spark package to /usr/lib/spark/spark-*
> c) cd sbin
> d) start-master.sh
> e) start-slave.sh 
> f) cd ../bin
> g) Start spark-shell 
> h) I pasted in the following Scala code:
> import org.apache.spark.sql.types._
> val VT = org.apache.spark.ml.linalg.SQLDataTypes.VectorType
> val schema = StructType(Array(StructField("features", 
> VT,true),StructField("label",DoubleType,true)))
> val df = spark.read.schema(schema).json("file:///tmp/di873-train.json")
> val trainer = new 
> org.apache.spark.ml.classification.LogisticRegression().setMaxIter(500).setElasticNetParam(1.0).setRegParam(0.1).setTol(0.1).setFitIntercept(true)
> val model = trainer.fit(df)
> i) Then I monitored the progress in the Spark UI under the Jobs tab.
> With Spark 1.6.1 and Spark 2.0.2, the training (treeAggregate) finished after 
> around 25-30 jobs. But with 2.1+, the training was not converging and only 
> finished because it hit the max iterations (i.e. 500).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-21646) BinaryComparison shouldn't auto cast string to int/long

2017-08-05 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-21646:
---

 Summary: BinaryComparison shouldn't auto cast string to int/long
 Key: SPARK-21646
 URL: https://issues.apache.org/jira/browse/SPARK-21646
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.2.0
Reporter: Yuming Wang


Hive:
{code:sql}
$ hive -S
hive> create table tmp.wym_spark_123(c1 string, c2 string);
hive> insert into tmp.wym_spark_123 values('92233720368547758071', 'a');
hive> insert into tmp.wym_spark_123 values('21474836471', 'b');
hive> insert into tmp.wym_spark_123 values('10', 'c');
hive> select * from tmp.wym_spark_123 where c1 > 0;
92233720368547758071a
10  c
21474836471 b
hive>
{code}

{code:sql}
$ spark-sql -S
spark-sql> select * from tmp.wym_spark_123 where c1 > 0;
10  c   
spark-sql> select * from tmp.wym_spark_123 where c1 > 0L;
21474836471 b
10  c
spark-sql> explain select * from tmp.wym_spark_123 where c1 > 0;
== Physical Plan ==
*Project [c1#14, c2#15]
+- *Filter (isnotnull(c1#14) && (cast(c1#14 as int) > 0))
   +- *FileScan parquet tmp.wym_spark_123[c1#14,c2#15] Batched: true, Format: 
Parquet, Location: 
InMemoryFileIndex[viewfs://cluster4/user/hive/warehouse/tmp.db/wym_spark_123], 
PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: 
struct
spark-sql> 
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-21643) LR dataset worked in Spark 1.6.3, 2.0.2 stopped working in 2.1.0 onward

2017-08-05 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-21643.
---
Resolution: Invalid

This isn't narrowed down nearly enough to be a JIRA. It's not even clear 
there's a problem as you just get a different number of iterations.

> LR dataset worked in Spark 1.6.3, 2.0.2 stopped working in 2.1.0 onward
> ---
>
> Key: SPARK-21643
> URL: https://issues.apache.org/jira/browse/SPARK-21643
> Project: Spark
>  Issue Type: Bug
>  Components: ML
>Affects Versions: 2.1.0, 2.1.1, 2.2.0
> Environment: CentOS 7, 256G memory, and 52 CPUs VM
>Reporter: Thomas Kwan
>
> This dataset works on 1.6.x and 2.0.x, but it is not converging with 2.1+.
> a) Download the data set 
> (https://s3.amazonaws.com/manage-partners/pipeline/di873-train.json.gz) and 
> uncompress it; I placed it at /tmp/di873-train.json
> b) Download the spark package to /usr/lib/spark/spark-*
> c) cd sbin
> d) start-master.sh
> e) start-slave.sh 
> f) cd ../bin
> g) Start spark-shell 
> h) I pasted in the following Scala code:
> import org.apache.spark.sql.types._
> val VT = org.apache.spark.ml.linalg.SQLDataTypes.VectorType
> val schema = StructType(Array(StructField("features", 
> VT,true),StructField("label",DoubleType,true)))
> val df = spark.read.schema(schema).json("file:///tmp/di873-train.json")
> val trainer = new 
> org.apache.spark.ml.classification.LogisticRegression().setMaxIter(500).setElasticNetParam(1.0).setRegParam(0.1).setTol(0.1).setFitIntercept(true)
> val model = trainer.fit(df)
> i) Then I monitored the progress in the Spark UI under the Jobs tab.
> With Spark 1.6.1 and Spark 2.0.2, the training (treeAggregate) finished after 
> around 25-30 jobs. But with 2.1+, the training was not converging and only 
> finished because it hit the max iterations (i.e. 500).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-21645) SparkSQL Left outer join get the error result when use phoenix spark plugin

2017-08-05 Thread shining (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shining updated SPARK-21645:

Description: 
I have two tables in phoenix: AN_BASEINFO and AN_SUP_BASEINFO.
Then I create the external datasource tables in SparkSQL through the phoenix 
spark plugin, like:

create table AN_BASEINFO 
using org.apache.phoenix.spark
OPTIONS(table "AN_BASEINFO ", zkUrl "172.16.12.82:2181")
and 
create table AN_SUP_BASEINFO 
using org.apache.phoenix.spark
OPTIONS(table "AN_SUP_BASEINFO ", zkUrl "172.16.12.82:2181")

In SparkSQL I execute a query that uses a left outer join; the SQL is:
{color:red}{color:#f79232}_
*select
a.anchedate,b.womempnumdis,b.holdingsmsgdis
from
AN_BASEINFO a
 left outer join AN_SUP_BASEINFO b
on
   a.S_EXT_NODENUM = b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID
where
a.ANCHEID= '2c9e87ea5bd35458015c2df4003a1025';*_{color}{color}

The result is: 2017-05-22 00:00:00.0   NULL    NULL

But actually, table AN_SUP_BASEINFO contains a record where a.S_EXT_NODENUM = 
b.S_EXT_NODENUM and a.ANCHEID = b.ANCHEID.
If I add the filter condition b.holdingsmsgdis is not null to the SQL, the result 
is right:
2017-05-22 00:00:00.0   2   1 
the sql:
*{color:#d04437}select
a.anchedate,b.womempnumdis,b.holdingsmsgdis
from
AN_BASEINFO a
 left outer join AN_SUP_BASEINFO b
on
   a.S_EXT_NODENUM = b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID
where
a.ANCHEID= '2c9e87ea5bd35458015c2df4003a1025'{color:#d04437}and 
b.holdingsmsgdis is not null;{color}{color}*
{color:#d04437}{color:#14892c}result is right: 2017-05-22 00:00:00.0   2   
1 {color}{color}

Does anyone know about this? Please help!


  was:
I have two tables in phoenix: AN_BASEINFO and AN_SUP_BASEINFO 
Then I crate the outer datasource table in sparksql through phoenix spark 
plugin.like

create table AN_BASEINFO 
using org.apache.phoenix.spark
OPTIONS(table "AN_BASEINFO ", zkUrl "172.16.12.82:2181")
and 
create table AN_SUP_BASEINFO 
using org.apache.phoenix.spark
OPTIONS(table "AN_SUP_BASEINFO ", zkUrl "172.16.12.82:2181")

IN SparkSQL I execute a sql use lef outer join,the sql is :
{color:red}{color:#f79232}_
*select
a.anchedate,b.womempnumdis,b.holdingsmsgdis
from
AN_BASEINFO a
 left outer join AN_SUP_BASEINFO b
on
   a.S_EXT_NODENUM = b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID
where
a.ANCHEID= '2c9e87ea5bd35458015c2df4003a1025';*_{color}{color}

the result is : 2017-05-22 00:00:00.0   NULLNULL 

But actually, table AN_SUP_BASEINFO exist an record that  a.S_EXT_NODENUM = 
b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID;
If I add a filter condition b.holdingsmsgdis is not null in the sql, the result 
is right:
2017-05-22 00:00:00.0   2   1 
the sql:
select
a.anchedate,b.womempnumdis,b.holdingsmsgdis
from
AN_BASEINFO a
 left outer join AN_SUP_BASEINFO b
on
   a.S_EXT_NODENUM = b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID
where
a.ANCHEID= '2c9e87ea5bd35458015c2df4003a1025'{color:#d04437}and 
b.holdingsmsgdis is not null;{color}
result is right: 2017-05-22 00:00:00.0   2   1 

Is there anyone who know this?Please help!



> SparkSQL Left outer join get the error result when use phoenix spark plugin
> ---
>
> Key: SPARK-21645
> URL: https://issues.apache.org/jira/browse/SPARK-21645
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0, 2.2.0
> Environment: spark2.1.0
> hbase 1.1.2
> phoenix4.10
>Reporter: shining
>
> I have two tables in phoenix: AN_BASEINFO and AN_SUP_BASEINFO.
> Then I create the external datasource tables in SparkSQL through the phoenix 
> spark plugin, like:
> create table AN_BASEINFO 
> using org.apache.phoenix.spark
> OPTIONS(table "AN_BASEINFO ", zkUrl "172.16.12.82:2181")
> and 
> create table AN_SUP_BASEINFO 
> using org.apache.phoenix.spark
> OPTIONS(table "AN_SUP_BASEINFO ", zkUrl "172.16.12.82:2181")
> In SparkSQL I execute a query that uses a left outer join; the SQL is:
> {color:red}{color:#f79232}_
> *select
> a.anchedate,b.womempnumdis,b.holdingsmsgdis
> from
> AN_BASEINFO a
>  left outer join AN_SUP_BASEINFO b
> on
>a.S_EXT_NODENUM = b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID
> where
> a.ANCHEID= '2c9e87ea5bd35458015c2df4003a1025';*_{color}{color}
> The result is: 2017-05-22 00:00:00.0   NULL    NULL
> But actually, table AN_SUP_BASEINFO contains a record where a.S_EXT_NODENUM = 
> b.S_EXT_NODENUM and a.ANCHEID = b.ANCHEID.
> If I add the filter condition b.holdingsmsgdis is not null to the SQL, the 
> result is right:
> 2017-05-22 00:00:00.0   2   1 
> the sql:
> *{color:#d04437}select
> a.anchedate,b.womempnumdis,b.holdingsmsgdis
> from
> AN_BASEINFO a
>  left outer join 

[jira] [Created] (SPARK-21645) SparkSQL Left outer join get the error result when use phoenix spark plugin

2017-08-05 Thread shining (JIRA)
shining created SPARK-21645:
---

 Summary: SparkSQL Left outer join get the error result when use 
phoenix spark plugin
 Key: SPARK-21645
 URL: https://issues.apache.org/jira/browse/SPARK-21645
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.2.0, 2.1.0
 Environment: spark2.1.0
hbase 1.1.2
phoenix4.10
Reporter: shining


I have two tables in phoenix: AN_BASEINFO and AN_SUP_BASEINFO.
Then I create the external datasource tables in SparkSQL through the phoenix 
spark plugin, like:

create table AN_BASEINFO 
using org.apache.phoenix.spark
OPTIONS(table "AN_BASEINFO ", zkUrl "172.16.12.82:2181")
and 
create table AN_SUP_BASEINFO 
using org.apache.phoenix.spark
OPTIONS(table "AN_SUP_BASEINFO ", zkUrl "172.16.12.82:2181")

In SparkSQL I execute a query that uses a left outer join; the SQL is:
{color:red}{color:#f79232}_
*select
a.anchedate,b.womempnumdis,b.holdingsmsgdis
from
AN_BASEINFO a
 left outer join AN_SUP_BASEINFO b
on
   a.S_EXT_NODENUM = b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID
where
a.ANCHEID= '2c9e87ea5bd35458015c2df4003a1025';*_{color}{color}

The result is: 2017-05-22 00:00:00.0   NULL    NULL

But actually, table AN_SUP_BASEINFO contains a record where a.S_EXT_NODENUM = 
b.S_EXT_NODENUM and a.ANCHEID = b.ANCHEID.
If I add the filter condition b.holdingsmsgdis is not null to the SQL, the result 
is right:
2017-05-22 00:00:00.0   2   1 
the sql:
select
a.anchedate,b.womempnumdis,b.holdingsmsgdis
from
AN_BASEINFO a
 left outer join AN_SUP_BASEINFO b
on
   a.S_EXT_NODENUM = b.S_EXT_NODENUM and a.ANCHEID   =b.ANCHEID
where
a.ANCHEID= '2c9e87ea5bd35458015c2df4003a1025'{color:#d04437}and 
b.holdingsmsgdis is not null;{color}
result is right: 2017-05-22 00:00:00.0   2   1 

Does anyone know about this? Please help!




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org