[jira] [Resolved] (SPARK-5756) Analyzer should not throw scala.NotImplementedError for illegitimate sql

2015-02-13 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei resolved SPARK-5756.

Resolution: Fixed

 Analyzer should not throw scala.NotImplementedError for illegitimate sql
 

 Key: SPARK-5756
 URL: https://issues.apache.org/jira/browse/SPARK-5756
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 ```SELECT CAST(x AS STRING) FROM src``` throws a NotImplementedError:
   CliDriver: scala.NotImplementedError: an implementation is missing
 at scala.Predef$.$qmark$qmark$qmark(Predef.scala:252)
 at 
 org.apache.spark.sql.catalyst.expressions.PrettyAttribute.dataType(namedExpressions.scala:221)
 at 
 org.apache.spark.sql.catalyst.expressions.Cast.resolved$lzycompute(Cast.scala:30)
 at 
 org.apache.spark.sql.catalyst.expressions.Cast.resolved(Cast.scala:30)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$childrenResolved$1.apply(Expression.scala:68)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$childrenResolved$1.apply(Expression.scala:68)
 at 
 scala.collection.LinearSeqOptimized$class.exists(LinearSeqOptimized.scala:80)
 at scala.collection.immutable.List.exists(List.scala:84)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression.childrenResolved(Expression.scala:68)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression.resolved$lzycompute(Expression.scala:56)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression.resolved(Expression.scala:56)
 at 
 org.apache.spark.sql.catalyst.expressions.NamedExpression.typeSuffix(namedExpressions.scala:62)
 at 
 org.apache.spark.sql.catalyst.expressions.Alias.toString(namedExpressions.scala:124)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression.prettyString(Expression.scala:78)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1$$anonfun$7.apply(Analyzer.scala:83)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1$$anonfun$7.apply(Analyzer.scala:83)
 at scala.collection.immutable.Stream.map(Stream.scala:376)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:83)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:81)
 at 
 org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:204)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:81)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:79)






[jira] [Created] (SPARK-5756) Analyzer should not throw scala.NotImplementedError for legitimate sql

2015-02-11 Thread wangfei (JIRA)
wangfei created SPARK-5756:
--

 Summary: Analyzer should not throw scala.NotImplementedError for 
legitimate sql
 Key: SPARK-5756
 URL: https://issues.apache.org/jira/browse/SPARK-5756
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


```SELECT CAST(x AS STRING) FROM src``` throws a NotImplementedError:
  CliDriver: scala.NotImplementedError: an implementation is missing
at scala.Predef$.$qmark$qmark$qmark(Predef.scala:252)
at 
org.apache.spark.sql.catalyst.expressions.PrettyAttribute.dataType(namedExpressions.scala:221)
at 
org.apache.spark.sql.catalyst.expressions.Cast.resolved$lzycompute(Cast.scala:30)
at 
org.apache.spark.sql.catalyst.expressions.Cast.resolved(Cast.scala:30)
at 
org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$childrenResolved$1.apply(Expression.scala:68)
at 
org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$childrenResolved$1.apply(Expression.scala:68)
at 
scala.collection.LinearSeqOptimized$class.exists(LinearSeqOptimized.scala:80)
at scala.collection.immutable.List.exists(List.scala:84)
at 
org.apache.spark.sql.catalyst.expressions.Expression.childrenResolved(Expression.scala:68)
at 
org.apache.spark.sql.catalyst.expressions.Expression.resolved$lzycompute(Expression.scala:56)
at 
org.apache.spark.sql.catalyst.expressions.Expression.resolved(Expression.scala:56)
at 
org.apache.spark.sql.catalyst.expressions.NamedExpression.typeSuffix(namedExpressions.scala:62)
at 
org.apache.spark.sql.catalyst.expressions.Alias.toString(namedExpressions.scala:124)
at 
org.apache.spark.sql.catalyst.expressions.Expression.prettyString(Expression.scala:78)
at 
org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1$$anonfun$7.apply(Analyzer.scala:83)
at 
org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1$$anonfun$7.apply(Analyzer.scala:83)
at scala.collection.immutable.Stream.map(Stream.scala:376)
at 
org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:83)
at 
org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:81)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:204)
at 
org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:81)
at 
org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:79)
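
For illustration, a minimal self-contained sketch of the failure mode (class names simplified; this is not the actual Catalyst code): PrettyAttribute exists only to pretty-print error messages and leaves dataType unimplemented, but Cast.resolved still touches child.dataType while the error message for the bad query is being built.

```scala
object NotImplementedSketch {
  sealed trait DataType
  case object StringType extends DataType

  trait Expression {
    def dataType: DataType
    def resolved: Boolean
  }

  // Stand-in for PrettyAttribute: meant only for toString in error messages.
  case class PrettyAttribute(name: String) extends Expression {
    def dataType: DataType = ???      // throws scala.NotImplementedError
    def resolved: Boolean  = true     // claims resolved, so callers proceed
    override def toString: String = name
  }

  case class Cast(child: Expression, to: DataType) extends Expression {
    def dataType: DataType = to
    // Evaluating this while pretty-printing the failed plan blows up.
    def resolved: Boolean = child.resolved && child.dataType == to
  }

  def main(args: Array[String]): Unit =
    // Mirrors `SELECT CAST(x AS STRING) FROM src` with an unresolvable `x`:
    println(Cast(PrettyAttribute("x"), StringType).resolved)
}
```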






[jira] [Updated] (SPARK-5756) Analyzer should not throw scala.NotImplementedError for illegitimate sql

2015-02-11 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-5756:
---
Summary: Analyzer should not throw scala.NotImplementedError for 
illegitimate sql  (was: Analyzer should not throw scala.NotImplementedError for 
legitimate sql)

 Analyzer should not throw scala.NotImplementedError for illegitimate sql
 

 Key: SPARK-5756
 URL: https://issues.apache.org/jira/browse/SPARK-5756
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 ```SELECT CAST(x AS STRING) FROM src``` throws a NotImplementedError:
   CliDriver: scala.NotImplementedError: an implementation is missing
 at scala.Predef$.$qmark$qmark$qmark(Predef.scala:252)
 at 
 org.apache.spark.sql.catalyst.expressions.PrettyAttribute.dataType(namedExpressions.scala:221)
 at 
 org.apache.spark.sql.catalyst.expressions.Cast.resolved$lzycompute(Cast.scala:30)
 at 
 org.apache.spark.sql.catalyst.expressions.Cast.resolved(Cast.scala:30)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$childrenResolved$1.apply(Expression.scala:68)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$childrenResolved$1.apply(Expression.scala:68)
 at 
 scala.collection.LinearSeqOptimized$class.exists(LinearSeqOptimized.scala:80)
 at scala.collection.immutable.List.exists(List.scala:84)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression.childrenResolved(Expression.scala:68)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression.resolved$lzycompute(Expression.scala:56)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression.resolved(Expression.scala:56)
 at 
 org.apache.spark.sql.catalyst.expressions.NamedExpression.typeSuffix(namedExpressions.scala:62)
 at 
 org.apache.spark.sql.catalyst.expressions.Alias.toString(namedExpressions.scala:124)
 at 
 org.apache.spark.sql.catalyst.expressions.Expression.prettyString(Expression.scala:78)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1$$anonfun$7.apply(Analyzer.scala:83)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1$$anonfun$7.apply(Analyzer.scala:83)
 at scala.collection.immutable.Stream.map(Stream.scala:376)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:83)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:81)
 at 
 org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:204)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:81)
 at 
 org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:79)






[jira] [Created] (SPARK-5649) Throw an exception when a datatype cast cannot be applied

2015-02-06 Thread wangfei (JIRA)
wangfei created SPARK-5649:
--

 Summary: Throw an exception when a datatype cast cannot be applied
 Key: SPARK-5649
 URL: https://issues.apache.org/jira/browse/SPARK-5649
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


Throw an exception when a datatype cast cannot be applied, to inform the user of 
the cast issue in their SQL.
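
As a sketch of the proposed behavior (the table of castable type pairs and the exception type below are assumptions for illustration, not Spark SQL's actual coercion rules), the check could fail fast with a descriptive message:

```scala
object CastCheckSketch {
  // Illustrative castability table only; Spark SQL's real rules are richer.
  private val castable: Set[(String, String)] =
    Set("int" -> "string", "string" -> "int", "int" -> "long")

  def checkCast(from: String, to: String): Unit =
    if (!castable(from -> to))
      throw new IllegalArgumentException(s"cannot cast $from to $to")

  def main(args: Array[String]): Unit = {
    checkCast("int", "string")   // ok
    checkCast("string", "long")  // throws: cannot cast string to long
  }
}
```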






[jira] [Created] (SPARK-5617) test failure of SQLQuerySuite

2015-02-05 Thread wangfei (JIRA)
wangfei created SPARK-5617:
--

 Summary: test failure of SQLQuerySuite
 Key: SPARK-5617
 URL: https://issues.apache.org/jira/browse/SPARK-5617
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


SQLQuerySuite test failure: 
[info] - simple select (22 milliseconds)
[info] - sorting (722 milliseconds)
[info] - external sorting (728 milliseconds)
[info] - limit (95 milliseconds)
[info] - date row *** FAILED *** (35 milliseconds)
[info]   Results do not match for query:
[info]   'Limit 1
[info]'Project [CAST(2015-01-28, DateType) AS c0#3630]
[info] 'UnresolvedRelation [testData], None
[info]   
[info]   == Analyzed Plan ==
[info]   Limit 1
[info]Project [CAST(2015-01-28, DateType) AS c0#3630]
[info] LogicalRDD [key#0,value#1], MapPartitionsRDD[1] at mapPartitions at 
ExistingRDD.scala:35
[info]   
[info]   == Physical Plan ==
[info]   Limit 1
[info]Project [16463 AS c0#3630]
[info] PhysicalRDD [key#0,value#1], MapPartitionsRDD[1] at mapPartitions at 
ExistingRDD.scala:35
[info]   
[info]   == Results ==
[info]   !== Correct Answer - 1 ==   == Spark Answer - 1 ==
[info]   ![2015-01-28]   [2015-01-27] (QueryTest.scala:77)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at 
org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:495)
[info]   at 
org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555)
[info]   at org.scalatest.Assertions$class.fail(Assertions.scala:1328)
[info]   at org.scalatest.FunSuite.fail(FunSuite.scala:1555)
[info]   at org.apache.spark.sql.QueryTest.checkAnswer(QueryTest.scala:77)
[info]   at org.apache.spark.sql.QueryTest.checkAnswer(QueryTest.scala:95)
[info]   at 
org.apache.spark.sql.SQLQuerySuite$$anonfun$23.apply$mcV$sp(SQLQuerySuite.scala:300)
[info]   at 
org.apache.spark.sql.SQLQuerySuite$$anonfun$23.apply(SQLQuerySuite.scala:300)
[info]   at 
org.apache.spark.sql.SQLQuerySuite$$anonfun$23.apply(SQLQuerySuite.scala:300)
[info]   at 
org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
[info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
[info]   at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
[info]   at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555)
[info]   at 
org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
[info]   at 
org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
[info]   at 
org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
[info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
[info]   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
[info]   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
[info]   at 
org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
[info]   at 
org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
[info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNode
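
The physical plan renders the date constant as 16463, i.e. days since the epoch for 2015-01-28 UTC, which points at a time-zone-dependent conversion. A minimal sketch of the suspected off-by-one, in plain JDK code rather than Spark's internals:

```scala
import java.util.TimeZone

object DateOffByOne {
  val MillisPerDay = 24L * 60 * 60 * 1000

  // DateType stores days since epoch; java.sql.Date renders in the default zone.
  def render(daysSinceEpoch: Int): String =
    new java.sql.Date(daysSinceEpoch * MillisPerDay).toString

  def main(args: Array[String]): Unit = {
    TimeZone.setDefault(TimeZone.getTimeZone("GMT-08:00"))
    println(render(16463))   // 2015-01-27 -- one day early, as in the failure
    TimeZone.setDefault(TimeZone.getTimeZone("UTC"))
    println(render(16463))   // 2015-01-28
  }
}
```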






[jira] [Created] (SPARK-5592) java.net.URISyntaxException when inserting data into a partitioned table

2015-02-04 Thread wangfei (JIRA)
wangfei created SPARK-5592:
--

 Summary: java.net.URISyntaxException when inserting data into a 
partitioned table
 Key: SPARK-5592
 URL: https://issues.apache.org/jira/browse/SPARK-5592
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


create table sc as select * 
from (select '2011-01-11', '2011-01-11+14:18:26' from src tablesample (1 rows)
  union all 
  select '2011-01-11', '2011-01-11+15:18:26' from src tablesample (1 rows)
  union all 
  select '2011-01-11', '2011-01-11+16:18:26' from src tablesample (1 rows) 
) s;

create table sc_part (key string) partitioned by (ts string) stored as rcfile;

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

insert overwrite table sc_part partition(ts) select * from sc;

java.net.URISyntaxException: Relative path in absolute URI: 
ts=2011-01-11+15:18:26
at org.apache.hadoop.fs.Path.initialize(Path.java:206)
at org.apache.hadoop.fs.Path.<init>(Path.java:172)
at org.apache.hadoop.fs.Path.<init>(Path.java:94)
at 
org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer.org$apache$spark$sql$hive$SparkHiveDynamicPartitionWriterContainer$$newWriter$1(hiveWriterContainers.scala:230)
at 
org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer$$anonfun$getLocalFileWriter$1.apply(hiveWriterContainers.scala:243)
at 
org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer$$anonfun$getLocalFileWriter$1.apply(hiveWriterContainers.scala:243)
at 
scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:189)
at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:91)
at 
org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer.getLocalFileWriter(hiveWriterContainers.scala:243)
at 
org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:113)
at 
org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:105)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at 
org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(InsertIntoHiveTable.scala:105)
at 
org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:87)
at 
org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:87)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
ts=2011-01-11+15:18:26
at java.net.URI.checkPath(URI.java:1804)
at java.net.URI.<init>(URI.java:752)
at org.apache.hadoop.fs.Path.initialize(Path.java:203)
... 21 more
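
The dynamic partition value contains ':' and '+', which Hadoop's Path rejects in a relative URI. One possible direction, purely illustrative (the escape set below is an assumption, similar in spirit to Hive's escapePathName), is to escape the partition value before building the path:

```scala
object PartitionPathEscape {
  // Assumed unsafe characters, for illustration only.
  private val unsafe = ":%+=\n\u0001".toSet

  def escapePathName(value: String): String =
    value.flatMap { c =>
      if (unsafe(c) || c < ' ') f"%%${c.toInt}%02X" else c.toString
    }

  def main(args: Array[String]): Unit =
    // ts=2011-01-11+15:18:26  ->  ts=2011-01-11%2B15%3A18%3A26
    println("ts=" + escapePathName("2011-01-11+15:18:26"))
}
```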







[jira] [Created] (SPARK-5591) NoSuchObjectException for CTAS

2015-02-04 Thread wangfei (JIRA)
wangfei created SPARK-5591:
--

 Summary: NoSuchObjectException for CTAS
 Key: SPARK-5591
 URL: https://issues.apache.org/jira/browse/SPARK-5591
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


NoSuchObjectException for CTAS:
create table sc as select * 
from (select '2011-01-11', '2011-01-11+14:18:26' from src tablesample (1 rows)
  union all 
  select '2011-01-11', '2011-01-11+15:18:26' from src tablesample (1 rows)
  union all 
  select '2011-01-11', '2011-01-11+16:18:26' from src tablesample (1 rows) 
) s;

Get this exception:

15/02/04 19:44:02 ERROR Hive: NoSuchObjectException(message:default.sc table 
not found)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
at $Proxy8.get_table(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at $Proxy9.getTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:976)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
at 
org.apache.spark.sql.hive.HiveMetastoreCatalog.tableExists(HiveMetastoreCatalog.scala:152)
at 
org.apache.spark.sql.hive.HiveContext$$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$tableExists(HiveContext.scala:309)
at 
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.tableExists(Catalog.scala:121)
at 
org.apache.spark.sql.hive.HiveContext$$anon$2.tableExists(HiveContext.scala:309)
at 
org.apache.spark.sql.hive.execution.CreateTableAsSelect.run(CreateTableAsSelect.scala:63)
at 
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:53)







[jira] [Created] (SPARK-5583) Support unique join in hive context

2015-02-03 Thread wangfei (JIRA)
wangfei created SPARK-5583:
--

 Summary: Support unique join in hive context
 Key: SPARK-5583
 URL: https://issues.apache.org/jira/browse/SPARK-5583
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


Support unique join in hive context:

FROM UNIQUEJOIN PRESERVE T1 a (a.key), PRESERVE T2 b (b.key), PRESERVE T3 c 
(c.key)
SELECT a.key, b.key, c.key;







[jira] [Created] (SPARK-5587) Support change database owner

2015-02-03 Thread wangfei (JIRA)
wangfei created SPARK-5587:
--

 Summary: Support change database owner 
 Key: SPARK-5587
 URL: https://issues.apache.org/jira/browse/SPARK-5587
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


Support changing the database owner:
create database db_alter_onr;
describe database db_alter_onr;

alter database db_alter_onr set owner user user1;
describe database db_alter_onr;

alter database db_alter_onr set owner role role1;
describe database db_alter_onr;







[jira] [Updated] (SPARK-5383) support alias for udfs with multi output columns

2015-01-24 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-5383:
---
Summary: support alias for udfs with multi output columns  (was: Multi 
alias names support)

 support alias for udfs with multi output columns
 

 Key: SPARK-5383
 URL: https://issues.apache.org/jira/browse/SPARK-5383
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 Spark SQL does not currently support multiple alias names; the following sql fails in 
 spark-sql:
 select key as (k1, k2), value as (v1, v2) from src limit 5






[jira] [Updated] (SPARK-5383) support alias for udfs with multi output columns

2015-01-24 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-5383:
---
Description: 
When a UDF outputs multiple columns, we currently cannot alias them in 
spark-sql; see the following sql:

select stack(1, key, value, key, value) as (a, b, c, d) from src limit 5;


  was:
Spark SQL does not currently support multiple alias names; the following sql fails in 
spark-sql:
select key as (k1, k2), value as (v1, v2) from src limit 5



 support alias for udfs with multi output columns
 

 Key: SPARK-5383
 URL: https://issues.apache.org/jira/browse/SPARK-5383
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 When a UDF outputs multiple columns, we currently cannot alias them in 
 spark-sql; see the following sql:
 select stack(1, key, value, key, value) as (a, b, c, d) from src limit 5;






[jira] [Created] (SPARK-5383) Multi alias names support

2015-01-23 Thread wangfei (JIRA)
wangfei created SPARK-5383:
--

 Summary: Multi alias names support
 Key: SPARK-5383
 URL: https://issues.apache.org/jira/browse/SPARK-5383
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


Spark SQL does not currently support multiple alias names; the following sql fails in 
spark-sql:
select key as (k1, k2), value as (v1, v2) from src limit 5







[jira] [Created] (SPARK-5367) support star expression in udf

2015-01-22 Thread wangfei (JIRA)
wangfei created SPARK-5367:
--

 Summary: support star expression in udf
 Key: SPARK-5367
 URL: https://issues.apache.org/jira/browse/SPARK-5367
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


Spark SQL does not currently support star expressions in UDFs; the following sql 
gets an error:
`
select concat(*) from src
`







[jira] [Updated] (SPARK-5367) support star expression in udf

2015-01-22 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-5367:
---
Description: 
Spark SQL does not currently support star expressions in UDFs; the following sql 
gets an error:
```
select concat( * ) from src
```


  was:
Spark SQL does not currently support star expressions in UDFs; the following sql 
gets an error:
`
select concat(*) from src
`



 support star expression in udf
 --

 Key: SPARK-5367
 URL: https://issues.apache.org/jira/browse/SPARK-5367
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 Spark SQL does not currently support star expressions in UDFs; the following sql 
 gets an error:
 ```
 select concat( * ) from src
 ```






[jira] [Updated] (SPARK-5373) literal in agg grouping expressions leads to incorrect result

2015-01-22 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-5373:
---
Description: select key, count( * ) from src group by key, 1 returns the wrong 
answer.  (was: select key, count(*) from src group by key, 1 returns the wrong 
answer.)

  literal in agg grouping expressions leads to incorrect result
 ---

 Key: SPARK-5373
 URL: https://issues.apache.org/jira/browse/SPARK-5373
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 select key, count( * ) from src group by key, 1 returns the wrong answer.






[jira] [Created] (SPARK-5373) literal in agg grouping expressions leads to incorrect result

2015-01-22 Thread wangfei (JIRA)
wangfei created SPARK-5373:
--

 Summary:  literal in agg grouping expressions leads to incorrect 
result
 Key: SPARK-5373
 URL: https://issues.apache.org/jira/browse/SPARK-5373
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


select key, count(*) from src group by key, 1 returns the wrong answer.






[jira] [Created] (SPARK-5285) Removed GroupExpression in catalyst

2015-01-16 Thread wangfei (JIRA)
wangfei created SPARK-5285:
--

 Summary:  Removed GroupExpression in catalyst
 Key: SPARK-5285
 URL: https://issues.apache.org/jira/browse/SPARK-5285
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


 Removed GroupExpression in catalyst






[jira] [Updated] (SPARK-5251) Using `tableIdentifier` in hive metastore

2015-01-15 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-5251:
---
Target Version/s: 1.3.0

 Using `tableIdentifier` in hive metastore 
 --

 Key: SPARK-5251
 URL: https://issues.apache.org/jira/browse/SPARK-5251
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 Using `tableIdentifier` in hive metastore 






[jira] [Updated] (SPARK-5251) Using `tableIdentifier` in hive metastore

2015-01-15 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-5251:
---
Target Version/s:   (was: 1.3.0)

 Using `tableIdentifier` in hive metastore 
 --

 Key: SPARK-5251
 URL: https://issues.apache.org/jira/browse/SPARK-5251
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 Using `tableIdentifier` in hive metastore 






[jira] [Created] (SPARK-5251) Using `tableIdentifier` in hive metastore

2015-01-14 Thread wangfei (JIRA)
wangfei created SPARK-5251:
--

 Summary: Using `tableIdentifier` in hive metastore 
 Key: SPARK-5251
 URL: https://issues.apache.org/jira/browse/SPARK-5251
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


Using `tableIdentifier` in hive metastore 






[jira] [Created] (SPARK-5240) Adding `createDataSourceTable` interface to Catalog

2015-01-13 Thread wangfei (JIRA)
wangfei created SPARK-5240:
--

 Summary: Adding `createDataSourceTable` interface to Catalog
 Key: SPARK-5240
 URL: https://issues.apache.org/jira/browse/SPARK-5240
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


Adding `createDataSourceTable` interface to Catalog.






[jira] [Commented] (SPARK-4861) Refactor command in spark sql

2015-01-11 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272862#comment-14272862
 ] 

wangfei commented on SPARK-4861:


[~yhuai] Of course, if possible, but I have not found a way to remove it, since in 
HiveCommandStrategy we need to distinguish hive metastore tables from temporary 
tables, so for now HiveCommandStrategy stays. Any ideas?

 Refactor command in spark sql
 --

 Key: SPARK-4861
 URL: https://issues.apache.org/jira/browse/SPARK-4861
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.1
Reporter: wangfei
 Fix For: 1.3.0


 Fix a TODO in Spark SQL: remove ```Command``` and use ```RunnableCommand``` 
 instead.






[jira] [Commented] (SPARK-4572) [SQL] spark-sql exits when encountering an error

2015-01-09 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270960#comment-14270960
 ] 

wangfei commented on SPARK-4572:


Which version did you see this error on? It should already be fixed.

 [SQL] spark-sql exits when encountering an error 
 -

 Key: SPARK-4572
 URL: https://issues.apache.org/jira/browse/SPARK-4572
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: Fuqing Yang
   Original Estimate: 1h
  Remaining Estimate: 1h

 While using spark-sql, I found it usually exits when a sql fails, and you need 
 to rerun spark-sql. 
 This is not convenient; we should catch the exceptions instead of exiting by 
 default. 






[jira] [Commented] (SPARK-4574) Adding support for defining schema in foreign DDL commands.

2015-01-09 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270967#comment-14270967
 ] 

wangfei commented on SPARK-4574:


[~pwendell] Got it, thanks.

 Adding support for defining schema in foreign DDL commands.
 ---

 Key: SPARK-4574
 URL: https://issues.apache.org/jira/browse/SPARK-4574
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 Add support for defining a schema in foreign DDL commands. Currently foreign DDL 
 supports commands like:
CREATE TEMPORARY TABLE avroTable
USING org.apache.spark.sql.avro
OPTIONS (path ../hive/src/test/resources/data/files/episodes.avro)
 Let users define the schema instead of inferring it from the file, so we can 
 support DDL commands as follows:
CREATE TEMPORARY TABLE avroTable(a int, b string)
USING org.apache.spark.sql.avro
OPTIONS (path ../hive/src/test/resources/data/files/episodes.avro)






[jira] [Commented] (SPARK-1442) Add Window function support

2015-01-09 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270973#comment-14270973
 ] 

wangfei commented on SPARK-1442:


Why were both PRs closed?

 Add Window function support
 ---

 Key: SPARK-1442
 URL: https://issues.apache.org/jira/browse/SPARK-1442
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Reporter: Chengxiang Li
 Attachments: Window Function.pdf


 Similar to Hive, add window function support to catalyst.
 https://issues.apache.org/jira/browse/HIVE-4197
 https://issues.apache.org/jira/browse/HIVE-896






[jira] [Closed] (SPARK-4673) Optimizing limit using coalesce

2015-01-09 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei closed SPARK-4673.
--
Resolution: Fixed

Since coalesce(1) leads to running with a single thread, it does not always 
speed up limit, so closing this one.
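
For illustration, a sketch of the trade-off in plain RDD code (not Spark SQL's Limit operator): repartition(1) shuffles the pre-limited rows into a single partition, while coalesce(1) avoids the shuffle but collapses all upstream work onto a single task:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object LimitTradeoff {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("limit-sketch").setMaster("local[4]"))
    val n  = 10
    // Per-partition pre-limit, as a limit operator does before gathering rows.
    val pre = sc.parallelize(1 to 1000000, 8).mapPartitions(_.take(n))
    val viaShuffle  = pre.repartition(1).mapPartitions(_.take(n)).collect() // extra shuffle
    val viaCoalesce = pre.coalesce(1).mapPartitions(_.take(n)).collect()    // no shuffle, one task
    assert(viaShuffle.length == n && viaCoalesce.length == n)
    sc.stop()
  }
}
```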

 Optimizing limit using coalesce
 ---

 Key: SPARK-4673
 URL: https://issues.apache.org/jira/browse/SPARK-4673
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.3.0


 Limit currently uses ShuffledRDD with a HashPartitioner to repartition to 1, 
 which leads to a shuffle.






[jira] [Closed] (SPARK-5000) Alias support string literal in spark sql

2015-01-08 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei closed SPARK-5000.
--
Resolution: Fixed

 Alias support string literal in spark sql
 -

 Key: SPARK-5000
 URL: https://issues.apache.org/jira/browse/SPARK-5000
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 Support string literals as aliases in the spark sql parser:
 select key, value as 'vvv' from tableA;






[jira] [Commented] (SPARK-5000) Alias support string literal in spark sql

2015-01-08 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270718#comment-14270718
 ] 

wangfei commented on SPARK-5000:


Backticks can do this, so closing this one.

 Alias support string literal in spark sql
 -

 Key: SPARK-5000
 URL: https://issues.apache.org/jira/browse/SPARK-5000
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 Support string literals as aliases in the spark sql parser:
 select key, value as 'vvv' from tableA;






[jira] [Created] (SPARK-5165) Add support for rollup and cube in sqlcontext

2015-01-08 Thread wangfei (JIRA)
wangfei created SPARK-5165:
--

 Summary: Add support for rollup and cube in sqlcontext
 Key: SPARK-5165
 URL: https://issues.apache.org/jira/browse/SPARK-5165
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


Add support for rollup and cube in SQLContext






[jira] [Created] (SPARK-5029) Enable FROM to follow multiple brackets

2014-12-30 Thread wangfei (JIRA)
wangfei created SPARK-5029:
--

 Summary: Enable FROM to follow multiple brackets
 Key: SPARK-5029
 URL: https://issues.apache.org/jira/browse/SPARK-5029
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


Enable FROM to be followed by multiple bracketed subqueries, such as:
select key from ((select * from testData limit 1) union all (select * from 
testData limit 1)) x limit 1






[jira] [Created] (SPARK-5000) Alias support string literal in spark sql parser

2014-12-29 Thread wangfei (JIRA)
wangfei created SPARK-5000:
--

 Summary: Alias support string literal in spark sql parser
 Key: SPARK-5000
 URL: https://issues.apache.org/jira/browse/SPARK-5000
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


Support string literals as aliases in the spark sql parser:

select key, value as 'vvv' from tableA;






[jira] [Updated] (SPARK-5000) Alias support string literal in spark sql

2014-12-29 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-5000:
---
Summary: Alias support string literal in spark sql  (was: Alias support 
string literal in spark sql parser)

 Alias support string literal in spark sql
 -

 Key: SPARK-5000
 URL: https://issues.apache.org/jira/browse/SPARK-5000
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei

 Support string literals as aliases in the spark sql parser:
 select key, value as 'vvv' from tableA;






[jira] [Created] (SPARK-4984) add a pop-up containing the full job description when it is very long

2014-12-28 Thread wangfei (JIRA)
wangfei created SPARK-4984:
--

 Summary: add a pop-up containing the full job description when 
it is very long
 Key: SPARK-4984
 URL: https://issues.apache.org/jira/browse/SPARK-4984
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 1.2.0
Reporter: wangfei


Add a pop-up containing the full job description when it is very long.






[jira] [Created] (SPARK-4975) HiveInspectorSuite test failure

2014-12-27 Thread wangfei (JIRA)
wangfei created SPARK-4975:
--

 Summary: HiveInspectorSuite test failure
 Key: SPARK-4975
 URL: https://issues.apache.org/jira/browse/SPARK-4975
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.0
Reporter: wangfei


HiveInspectorSuite test failure:
[info] - wrap / unwrap null, constant null and writables *** FAILED *** (21 
milliseconds)
[info]   1 did not equal 0 (HiveInspectorSuite.scala:136)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at 
org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500)
[info]   at 
org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555)
[info]   at 
org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466)
[info]   at 
org.apache.spark.sql.hive.HiveInspectorSuite.checkValues(HiveInspectorSuite.scala:136)
[info]   at 
org.apache.spark.sql.hive.HiveInspectorSuite$$anonfun$checkValues$1.apply(HiveInspectorSuite.scala:124)
[info]   at 
org.apache.spark.sql.hive.HiveInspectorSuite$$anonfun$checkValues$1.apply(HiveInspectorSuite.scala:123)
[info]   at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
[info]   at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
[info]   at scala.collection.immutable.List.foreach(List.scala:318)
[info]   at 
scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
[info]   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
[info]   at 
org.apache.spark.sql.hive.HiveInspectorSuite.checkValues(HiveInspectorSuite.scala:123)
[info]   at 
org.apache.spark.sql.hive.HiveInspectorSuite$$anonfun$3.apply$mcV$sp(HiveInspectorSuite.scala:163)
[info]   at 
org.apache.spark.sql.hive.HiveInspectorSuite$$anonfun$3.apply(HiveInspectorSuite.scala:148)
[info]   at 
org.apache.spark.sql.hive.HiveInspectorSuite$$anonfun$3.apply(HiveInspectorSuite.scala:148)
[info]   at 
org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
[info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
[info]   at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
[info]   at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555)
[info]   at 
org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
[info]   at 
org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
[info]   at 
org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
[info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
[info]   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
[info]   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
[info]   at 
org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
[info]   at 
org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
[info]   at 
org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
[info]   at 
org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
[info]   at scala.collection.immutable.List.foreach(List.scala:318)
[info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
[info]   at 
org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
[info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
[info]   at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
[info]   at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
[info]   at org.scalatest.Suite$class.run(Suite.scala:1424)






[jira] [Created] (SPARK-4935) When hive.cli.print.header is configured, spark-sql aborts if passed an invalid sql

2014-12-23 Thread wangfei (JIRA)
wangfei created SPARK-4935:
--

 Summary: When hive.cli.print.header is configured, spark-sql aborts 
if passed an invalid sql
 Key: SPARK-4935
 URL: https://issues.apache.org/jira/browse/SPARK-4935
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0, 1.2.0
Reporter: wangfei
 Fix For: 1.3.0


When hive.cli.print.header is configured, spark-sql aborts if passed an invalid 
sql.






[jira] [Created] (SPARK-4937) Adding optimization to simplify the filter condition

2014-12-23 Thread wangfei (JIRA)
wangfei created SPARK-4937:
--

 Summary: Adding optimization to simplify the filter condition
 Key: SPARK-4937
 URL: https://issues.apache.org/jira/browse/SPARK-4937
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.3.0


Add an optimization to simplify the filter condition:

1. Conditions that can be folded to a constant boolean, such as:
a < 3 && a > 5   =>  False
a <> 1 || a <> 0  =>  True

2. Simplify And/Or conditions, such as this sql (one of hive-testbench):
select
sum(l_extendedprice * (1 - l_discount)) as revenue
from
lineitem,
part
where
(
p_partkey = l_partkey
and p_brand = 'Brand#32'
and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
and l_quantity >= 7 and l_quantity <= 7 + 10
and p_size between 1 and 5
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#35'
and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
and l_quantity >= 15 and l_quantity <= 15 + 10
and p_size between 1 and 10
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#24'
and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
and l_quantity >= 26 and l_quantity <= 26 + 10
and p_size between 1 and 15
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
);

Before this optimization the plan is a CartesianProduct; in my local test this sql 
hung and could not produce a result. After the optimization the CartesianProduct 
is replaced by a ShuffledHashJoin, which needs only 20+ seconds to run this sql.
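
A toy sketch of the first rewrite (not Catalyst's optimizer; the rule set is illustrative): fold predicates whose ranges are contradictory into constant booleans, so that OR-of-ANDs filters like the query above can expose the shared p_partkey = l_partkey join condition:

```scala
object SimplifyFilterSketch {
  sealed trait Pred
  case class Lt(col: String, v: Int) extends Pred   // col < v
  case class Gt(col: String, v: Int) extends Pred   // col > v
  case class And(l: Pred, r: Pred)   extends Pred
  case class Const(b: Boolean)       extends Pred

  def simplify(p: Pred): Pred = p match {
    // col < a AND col > b can never hold for integers when a <= b + 1
    case And(Lt(c1, a), Gt(c2, b)) if c1 == c2 && a <= b + 1 => Const(false)
    case And(l, r) =>
      (simplify(l), simplify(r)) match {
        case (Const(false), _) | (_, Const(false)) => Const(false)
        case (Const(true), x)                      => x
        case (x, Const(true))                      => x
        case (x, y)                                => And(x, y)
      }
    case other => other
  }

  def main(args: Array[String]): Unit =
    println(simplify(And(Lt("a", 3), Gt("a", 5))))   // Const(false)
}
```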






[jira] [Created] (SPARK-4938) Adding optimization to simplify the filter condition

2014-12-23 Thread wangfei (JIRA)
wangfei created SPARK-4938:
--

 Summary: Adding optimization to simplify the filter condition
 Key: SPARK-4938
 URL: https://issues.apache.org/jira/browse/SPARK-4938
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.3.0


Add an optimization to simplify the filter condition:

1. Conditions that can be folded to a constant boolean, such as:
a < 3 && a > 5   =>  False
a <> 1 || a <> 0  =>  True

2. Simplify And/Or conditions, such as this sql (one of hive-testbench):
select
sum(l_extendedprice * (1 - l_discount)) as revenue
from
lineitem,
part
where
(
p_partkey = l_partkey
and p_brand = 'Brand#32'
and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
and l_quantity >= 7 and l_quantity <= 7 + 10
and p_size between 1 and 5
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#35'
and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
and l_quantity >= 15 and l_quantity <= 15 + 10
and p_size between 1 and 10
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#24'
and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
and l_quantity >= 26 and l_quantity <= 26 + 10
and p_size between 1 and 15
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
);

Before this optimization the plan is a CartesianProduct; in my local test this sql 
hung and could not produce a result. After the optimization the CartesianProduct 
is replaced by a ShuffledHashJoin, which needs only 20+ seconds to run this sql.






[jira] [Commented] (SPARK-4938) Adding optimization to simplify the filter condition

2014-12-23 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257042#comment-14257042
 ] 

wangfei commented on SPARK-4938:


Duplicate

 Adding optimization to simplify the filter condition
 

 Key: SPARK-4938
 URL: https://issues.apache.org/jira/browse/SPARK-4938
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.3.0


 Add an optimization to simplify the filter condition:
 1. Conditions that can be folded to a constant boolean, such as:
 a < 3 && a > 5   =>  False
 a <> 1 || a <> 0  =>  True
 2. Simplify And/Or conditions, such as this sql (one of hive-testbench):
 select
 sum(l_extendedprice * (1 - l_discount)) as revenue
 from
 lineitem,
 part
 where
 (
 p_partkey = l_partkey
 and p_brand = 'Brand#32'
 and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
 and l_quantity >= 7 and l_quantity <= 7 + 10
 and p_size between 1 and 5
 and l_shipmode in ('AIR', 'AIR REG')
 and l_shipinstruct = 'DELIVER IN PERSON'
 )
 or
 (
 p_partkey = l_partkey
 and p_brand = 'Brand#35'
 and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
 and l_quantity >= 15 and l_quantity <= 15 + 10
 and p_size between 1 and 10
 and l_shipmode in ('AIR', 'AIR REG')
 and l_shipinstruct = 'DELIVER IN PERSON'
 )
 or
 (
 p_partkey = l_partkey
 and p_brand = 'Brand#24'
 and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
 and l_quantity >= 26 and l_quantity <= 26 + 10
 and p_size between 1 and 15
 and l_shipmode in ('AIR', 'AIR REG')
 and l_shipinstruct = 'DELIVER IN PERSON'
 );
  Before this optimization the plan is a CartesianProduct; in my local test this sql 
 hung and could not produce a result. After the optimization the CartesianProduct 
 is replaced by a ShuffledHashJoin, which needs only 20+ seconds to run this sql.






[jira] [Created] (SPARK-4861) Refactor command in spark sql

2014-12-16 Thread wangfei (JIRA)
wangfei created SPARK-4861:
--

 Summary: Refactor command in spark sql
 Key: SPARK-4861
 URL: https://issues.apache.org/jira/browse/SPARK-4861
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.1
Reporter: wangfei
 Fix For: 1.3.0


Fix a TODO in Spark SQL: remove ```Command``` and use ```RunnableCommand``` 
instead.
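
A sketch of the target shape (signatures assumed, not copied from Spark): each command becomes a node carrying a run() method, executed once by a generic physical operator instead of a bespoke operator per command:

```scala
trait Row
trait SQLContext

// Assumed shape of the replacement trait.
trait RunnableCommand {
  def run(sqlContext: SQLContext): Seq[Row]
}

// Example: a hypothetical command that applies a setting and returns no rows.
case class SetCommandSketch(key: String, value: String) extends RunnableCommand {
  def run(sqlContext: SQLContext): Seq[Row] = {
    // apply the setting through sqlContext here
    Seq.empty
  }
}
```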







[jira] [Created] (SPARK-4845) Adding a parallelismRatio to control the number of partitions of a ShuffledRDD

2014-12-14 Thread wangfei (JIRA)
wangfei created SPARK-4845:
--

 Summary: Adding a parallelismRatio to control the number of 
partitions of a ShuffledRDD
 Key: SPARK-4845
 URL: https://issues.apache.org/jira/browse/SPARK-4845
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.3.0


Add a parallelismRatio to control the number of partitions of a ShuffledRDD. The 
rule is:

 Math.max(1, parallelismRatio * number of partitions of the largest upstream 
RDD)

The ratio is 1.0 by default to stay compatible with the old behavior. Once we 
have good experience with it, we can change the default.
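
A minimal sketch of the rule (method and parameter names assumed):

```scala
object ParallelismRatio {
  // max(1, ratio * partitions of the largest parent RDD)
  def shufflePartitions(parentPartitionCounts: Seq[Int],
                        parallelismRatio: Double = 1.0): Int =
    math.max(1, (parallelismRatio * parentPartitionCounts.max).toInt)
}

// ParallelismRatio.shufflePartitions(Seq(100, 40), 0.5) == 50
// ParallelismRatio.shufflePartitions(Seq(4), 0.1)       == 1   // never below one
```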






[jira] [Created] (SPARK-4695) Get result using executeCollect in spark sql

2014-12-02 Thread wangfei (JIRA)
wangfei created SPARK-4695:
--

 Summary:  Get result using executeCollect in spark sql 
 Key: SPARK-4695
 URL: https://issues.apache.org/jira/browse/SPARK-4695
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.3.0


We should use executeCollect to collect the result, because executeCollect is a 
custom implementation of collect in Spark SQL that is better than the RDD's collect.
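
A sketch of why this matters (not SparkPlan's real API): an operator that already has its result at hand can serve collect() directly, where the generic path would materialize an RDD and run one more job just to gather the rows:

```scala
trait PlanSketch {
  def executeRdd(): Seq[String]                     // stand-in for execute().collect()
  def executeCollect(): Seq[String] = executeRdd()  // default: generic path
}

final class DriverSideLimit(rows: Seq[String], n: Int) extends PlanSketch {
  def executeRdd(): Seq[String] = rows.take(n)
  // Overridden: no extra job, just take the rows already available.
  override def executeCollect(): Seq[String] = rows.take(n)
}
```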






[jira] [Updated] (SPARK-4695) Get result using executeCollect in spark sql

2014-12-02 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4695:
---
Issue Type: Improvement  (was: Bug)

  Get result using executeCollect in spark sql 
 --

 Key: SPARK-4695
 URL: https://issues.apache.org/jira/browse/SPARK-4695
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.3.0


 We should use executeCollect to collect the result, because executeCollect is 
 a custom implementation of collect in Spark SQL that is better than the RDD's 
 collect.






[jira] [Created] (SPARK-4673) Optimizing limit using coalesce

2014-11-30 Thread wangfei (JIRA)
wangfei created SPARK-4673:
--

 Summary: Optimizing limit using coalesce
 Key: SPARK-4673
 URL: https://issues.apache.org/jira/browse/SPARK-4673
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.3.0


Limit currently uses ShuffledRDD with a HashPartitioner to repartition to 1, 
which leads to a shuffle.






[jira] [Created] (SPARK-4618) Make foreign DDL commands options case-insensitive

2014-11-25 Thread wangfei (JIRA)
wangfei created SPARK-4618:
--

 Summary: Make foreign DDL commands options case-insensitive
 Key: SPARK-4618
 URL: https://issues.apache.org/jira/browse/SPARK-4618
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.3.0


Make foreign DDL command options case-insensitive, so the following command 
works:
```
  create temporary table normal_parquet
  USING org.apache.spark.sql.parquet
  OPTIONS (
PATH '/xxx/data'
  )
``` 
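
A minimal sketch of one way to do this (class name assumed): normalize option keys once, so PATH, Path and path all resolve to the same entry:

```scala
case class CaseInsensitiveOptions(base: Map[String, String]) {
  private val normalized = base.map { case (k, v) => k.toLowerCase -> v }
  def get(key: String): Option[String] = normalized.get(key.toLowerCase)
}

// CaseInsensitiveOptions(Map("PATH" -> "/xxx/data")).get("path")  // Some(/xxx/data)
```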






[jira] [Created] (SPARK-4574) Adding support for defining schema in foreign DDL commands.

2014-11-24 Thread wangfei (JIRA)
wangfei created SPARK-4574:
--

 Summary: Adding support for defining schema in foreign DDL 
commands.
 Key: SPARK-4574
 URL: https://issues.apache.org/jira/browse/SPARK-4574
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


Add support for defining a schema in foreign DDL commands. Currently foreign DDL 
supports commands like:
   CREATE TEMPORARY TABLE avroTable
   USING org.apache.spark.sql.avro
   OPTIONS (path ../hive/src/test/resources/data/files/episodes.avro)

Let users define the schema instead of inferring it from the file, so we can 
support DDL commands as follows:
   CREATE TEMPORARY TABLE avroTable(a int, b string)
   USING org.apache.spark.sql.avro
   OPTIONS (path ../hive/src/test/resources/data/files/episodes.avro)






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4552) query for empty parquet table in spark sql hive get IllegalArgumentException

2014-11-22 Thread wangfei (JIRA)
wangfei created SPARK-4552:
--

 Summary: query for empty parquet table in spark sql hive get 
IllegalArgumentException
 Key: SPARK-4552
 URL: https://issues.apache.org/jira/browse/SPARK-4552
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


Run:
create table test_parquet(key int, value string) stored as parquet;
select * from test_parquet;
and get the following error:

java.lang.IllegalArgumentException: Could not find Parquet metadata at path 
file:/user/hive/warehouse/test_parquet
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$readMetaData$4.apply(ParquetTypes.scala:459)
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$readMetaData$4.apply(ParquetTypes.scala:459)
at scala.Option.getOrElse(Option.scala:120)
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$.readMetaData(ParquetTypes.sc



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4553) query for parquet table with string fields in spark sql hive get binary result

2014-11-22 Thread wangfei (JIRA)
wangfei created SPARK-4553:
--

 Summary: query for parquet table with string fields in spark sql 
hive get binary result
 Key: SPARK-4553
 URL: https://issues.apache.org/jira/browse/SPARK-4553
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


Run:
create table test_parquet(key int, value string) stored as parquet;
insert into table test_parquet select * from src;
select * from test_parquet;
and get the following result:

...
282 [B@38fda3b
138 [B@1407a24
238 [B@12de6fb
419 [B@6c97695
15 [B@4885067
118 [B@156a8d3
72 [B@65d20dd
90 [B@4c18906
307 [B@60b24cc
19 [B@59cf51b
435 [B@39fdf37
10 [B@4f799d7
277 [B@3950951
273 [B@596bf4b
306 [B@3e91557
224 [B@3781d61
309 [B@2d0d128
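
Each `[B@...` value is java.lang.Object.toString applied to a byte array: the 
scan returns the raw Parquet binary instead of decoding it to a String. A 
minimal demonstration of the symptom and the missing conversion:

```
// "[B@..." is the default toString of a byte array, not a readable value.
val bytes: Array[Byte] = "val_282".getBytes("UTF-8")
println(bytes.toString)             // prints something like [B@38fda3b
println(new String(bytes, "UTF-8")) // prints val_282 -- the missing conversion
```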



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4554) Set fair scheduler pool for JDBC client session in hive 13

2014-11-22 Thread wangfei (JIRA)
wangfei created SPARK-4554:
--

 Summary: Set fair scheduler pool for JDBC client session in hive 13
 Key: SPARK-4554
 URL: https://issues.apache.org/jira/browse/SPARK-4554
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


Currently the Hive 0.13 shim does not support setting the fair scheduler pool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4559) Adding support for ucase and lcase

2014-11-22 Thread wangfei (JIRA)
wangfei created SPARK-4559:
--

 Summary: Adding support for ucase and lcase
 Key: SPARK-4559
 URL: https://issues.apache.org/jira/browse/SPARK-4559
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


Add support for ucase and lcase in Spark SQL.
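
Presumably ucase/lcase would behave as synonyms for upper/lower (as in Hive); a 
toy sketch of registering them as aliases over the same implementations 
(hypothetical registry, not the actual parser code):

```
// Toy sketch: one implementation, several registered names.
val upper: String => String = _.toUpperCase
val lower: String => String = _.toLowerCase

val registry: Map[String, String => String] =
  Map("upper" -> upper, "ucase" -> upper,
      "lower" -> lower, "lcase" -> lower)

assert(registry("ucase")("spark") == "SPARK")
assert(registry("lcase")("SQL") == "sql")
```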



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4449) specify port range in spark

2014-11-17 Thread wangfei (JIRA)
wangfei created SPARK-4449:
--

 Summary: specify port range in spark
 Key: SPARK-4449
 URL: https://issues.apache.org/jira/browse/SPARK-4449
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


 In some cases, we need to specify the range of ports used by Spark.
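
A minimal sketch of binding within a configured range (plain java.net; how the 
range is configured is left out): try each port in [min, max] until one is free.

```
import java.net.ServerSocket

// Toy sketch: walk the configured range and bind the first free port.
def bindInRange(min: Int, max: Int): ServerSocket = {
  for (port <- min to max) {
    try return new ServerSocket(port)
    catch { case _: java.io.IOException => () } // port in use, try the next one
  }
  sys.error(s"no free port in $min-$max")
}
```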



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4443) Statistics bug for external table in spark sql hive

2014-11-16 Thread wangfei (JIRA)
wangfei created SPARK-4443:
--

 Summary: Statistics bug for external table in spark sql hive
 Key: SPARK-4443
 URL: https://issues.apache.org/jira/browse/SPARK-4443
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: wangfei






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4443) Statistics bug for external table in spark sql hive

2014-11-16 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4443:
---
Description: When a table is external, the `totalSize` is always zero, which 
will influence the join strategy (a broadcast join is always used for external tables)  
(was: When table is external, `totalSize` is always zero, which will influence 
join strategy(always use broadcast join for external table))

 Statistics bug for external table in spark sql hive
 ---

 Key: SPARK-4443
 URL: https://issues.apache.org/jira/browse/SPARK-4443
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


 When a table is external, the `totalSize` is always zero, which will influence 
 the join strategy (a broadcast join is always used for external tables).
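
The mechanics of the bug, as a toy sketch: planners broadcast any relation whose 
reported size is below a threshold (in Spark SQL this is the 
`spark.sql.autoBroadcastJoinThreshold` setting), so a table that always reports 
size 0 always qualifies, however large it actually is.

```
// Toy sketch: a zero totalSize always passes a naive size-based broadcast check.
val autoBroadcastThreshold: Long = 10L * 1024 * 1024 // assumed 10 MB threshold

def naiveShouldBroadcast(totalSize: Long): Boolean =
  totalSize <= autoBroadcastThreshold

assert(naiveShouldBroadcast(0L)) // an external table reporting 0 always qualifies
```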



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4443) Statistics bug for external table in spark sql hive

2014-11-16 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4443:
---
  Description: When a table is external, `totalSize` is always zero, 
which will influence the join strategy (a broadcast join is always used for external tables)
 Target Version/s: 1.2.0
Affects Version/s: 1.1.0
Fix Version/s: 1.2.0

 Statistics bug for external table in spark sql hive
 ---

 Key: SPARK-4443
 URL: https://issues.apache.org/jira/browse/SPARK-4443
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


 When a table is external, `totalSize` is always zero, which will influence the 
 join strategy (a broadcast join is always used for external tables).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4292) incorrect result set in JDBC/ODBC

2014-11-06 Thread wangfei (JIRA)
wangfei created SPARK-4292:
--

 Summary: incorrect result set in JDBC/ODBC
 Key: SPARK-4292
 URL: https://issues.apache.org/jira/browse/SPARK-4292
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


Run select * from src and get the following (incorrect) result:
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4261) make right version info for beeline

2014-11-05 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4261:
---
Description: 
Running the Spark SQL JDBC/ODBC client, the output is:
JackydeMacBook-Pro:spark1 jackylee$ bin/beeline 
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Beeline version ??? by Apache Hive
We should make the version info for Beeline correct.

 make right version info for beeline
 ---

 Key: SPARK-4261
 URL: https://issues.apache.org/jira/browse/SPARK-4261
 Project: Spark
  Issue Type: Bug
  Components: Build, SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


 Running the Spark SQL JDBC/ODBC client, the output is:
 JackydeMacBook-Pro:spark1 jackylee$ bin/beeline 
 Spark assembly has been built with Hive, including Datanucleus jars on 
 classpath
 Beeline version ??? by Apache Hive
 We should make the version info for Beeline correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4261) make right version info for beeline

2014-11-05 Thread wangfei (JIRA)
wangfei created SPARK-4261:
--

 Summary: make right version info for beeline
 Key: SPARK-4261
 URL: https://issues.apache.org/jira/browse/SPARK-4261
 Project: Spark
  Issue Type: Bug
  Components: Build, SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4237) Generate right Manifest File for maven building

2014-11-05 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4237:
---
Description: 
Currently, building Spark with Maven produces Guava's manifest file;
we should generate the right manifest file for the Maven build.



  was:
Running with spark sql jdbc/odbc, the output will be 

JackydeMacBook-Pro:spark1 jackylee$ bin/beeline 
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Beeline version ??? by Apache Hive

we should add Manifest File for Maven building




 Generate right Manifest File for maven building
 ---

 Key: SPARK-4237
 URL: https://issues.apache.org/jira/browse/SPARK-4237
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


 Currently, building Spark with Maven produces Guava's manifest file;
 we should generate the right manifest file for the Maven build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4225) jdbc/odbc error when using maven build spark

2014-11-04 Thread wangfei (JIRA)
wangfei created SPARK-4225:
--

 Summary: jdbc/odbc error when using maven build spark
 Key: SPARK-4225
 URL: https://issues.apache.org/jira/browse/SPARK-4225
 Project: Spark
  Issue Type: Bug
  Components: Build, SQL
Affects Versions: 1.1.0
Reporter: wangfei
Priority: Blocker
 Fix For: 1.2.0


Use the following command to build Spark:
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.1 -Phive -DskipTests clean package

Then use Beeline to connect to the Thrift server and get this error:
 
14/11/04 11:30:31 INFO ObjectStore: Initialized ObjectStore
14/11/04 11:30:31 INFO AbstractService: Service:ThriftBinaryCLIService is 
started.
14/11/04 11:30:31 INFO AbstractService: Service:HiveServer2 is started.
14/11/04 11:30:31 INFO HiveThriftServer2: HiveThriftServer2 started
14/11/04 11:30:31 INFO ThriftCLIService: ThriftBinaryCLIService listening on 
0.0.0.0/0.0.0.0:1
14/11/04 11:33:26 INFO ThriftCLIService: Client protocol version: 
HIVE_CLI_SERVICE_PROTOCOL_V6
14/11/04 11:33:26 INFO HiveMetaStore: No user is added in admin role, since 
config is empty
14/11/04 11:33:26 INFO SessionState: No Tez session required at this point. 
hive.execution.engine=mr.
14/11/04 11:33:26 INFO SessionState: No Tez session required at this point. 
hive.execution.engine=mr.
14/11/04 11:33:26 ERROR TThreadPoolServer: Thrift error occurred during 
processing of message.
org.apache.thrift.protocol.TProtocolException: Cannot write a TUnion with no 
set value!
at org.apache.thrift.TUnion$TUnionStandardScheme.write(TUnion.java:240)
at org.apache.thrift.TUnion$TUnionStandardScheme.write(TUnion.java:213)
at org.apache.thrift.TUnion.write(TUnion.java:152)
at 
org.apache.hive.service.cli.thrift.TGetInfoResp$TGetInfoRespStandardScheme.write(TGetInfoResp.java:456)
at 
org.apache.hive.service.cli.thrift.TGetInfoResp$TGetInfoRespStandardScheme.write(TGetInfoResp.java:406)
at 
org.apache.hive.service.cli.thrift.TGetInfoResp.write(TGetInfoResp.java:341)
at 
org.apache.hive.service.cli.thrift.TCLIService$GetInfo_result$GetInfo_resultStandardScheme.write(TCLIService.java:3754)
at 
org.apache.hive.service.cli.thrift.TCLIService$GetInfo_result$GetInfo_resultStandardScheme.write(TCLIService.java:3718)
at 
org.apache.hive.service.cli.thrift.TCLIService$GetInfo_result.write(TCLIService.java:3669)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4225) jdbc/odbc error when using maven build spark

2014-11-04 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14196693#comment-14196693
 ] 

wangfei commented on SPARK-4225:


It seems there is some difference between building with sbt and building with Maven.

 jdbc/odbc error when using maven build spark
 

 Key: SPARK-4225
 URL: https://issues.apache.org/jira/browse/SPARK-4225
 Project: Spark
  Issue Type: Bug
  Components: Build, SQL
Affects Versions: 1.1.0
Reporter: wangfei
Priority: Blocker
 Fix For: 1.2.0


 Use the following command to build Spark:
 mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.1 -Phive -DskipTests clean 
 package
 Then use Beeline to connect to the Thrift server and get this error:
  
 14/11/04 11:30:31 INFO ObjectStore: Initialized ObjectStore
 14/11/04 11:30:31 INFO AbstractService: Service:ThriftBinaryCLIService is 
 started.
 14/11/04 11:30:31 INFO AbstractService: Service:HiveServer2 is started.
 14/11/04 11:30:31 INFO HiveThriftServer2: HiveThriftServer2 started
 14/11/04 11:30:31 INFO ThriftCLIService: ThriftBinaryCLIService listening on 
 0.0.0.0/0.0.0.0:1
 14/11/04 11:33:26 INFO ThriftCLIService: Client protocol version: 
 HIVE_CLI_SERVICE_PROTOCOL_V6
 14/11/04 11:33:26 INFO HiveMetaStore: No user is added in admin role, since 
 config is empty
 14/11/04 11:33:26 INFO SessionState: No Tez session required at this point. 
 hive.execution.engine=mr.
 14/11/04 11:33:26 INFO SessionState: No Tez session required at this point. 
 hive.execution.engine=mr.
 14/11/04 11:33:26 ERROR TThreadPoolServer: Thrift error occurred during 
 processing of message.
 org.apache.thrift.protocol.TProtocolException: Cannot write a TUnion with no 
 set value!
   at org.apache.thrift.TUnion$TUnionStandardScheme.write(TUnion.java:240)
   at org.apache.thrift.TUnion$TUnionStandardScheme.write(TUnion.java:213)
   at org.apache.thrift.TUnion.write(TUnion.java:152)
   at 
 org.apache.hive.service.cli.thrift.TGetInfoResp$TGetInfoRespStandardScheme.write(TGetInfoResp.java:456)
   at 
 org.apache.hive.service.cli.thrift.TGetInfoResp$TGetInfoRespStandardScheme.write(TGetInfoResp.java:406)
   at 
 org.apache.hive.service.cli.thrift.TGetInfoResp.write(TGetInfoResp.java:341)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$GetInfo_result$GetInfo_resultStandardScheme.write(TCLIService.java:3754)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$GetInfo_result$GetInfo_resultStandardScheme.write(TCLIService.java:3718)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$GetInfo_result.write(TCLIService.java:3669)
   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
   at 
 org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
   at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-4225) jdbc/odbc error when using maven build spark

2014-11-04 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14196693#comment-14196693
 ] 

wangfei edited comment on SPARK-4225 at 11/4/14 7:46 PM:
-

It seems there is some difference between using sbt and Maven when building 
Spark.


was (Author: scwf):
it seems there is some difference between using sbt and maven.

 jdbc/odbc error when using maven build spark
 

 Key: SPARK-4225
 URL: https://issues.apache.org/jira/browse/SPARK-4225
 Project: Spark
  Issue Type: Bug
  Components: Build, SQL
Affects Versions: 1.1.0
Reporter: wangfei
Priority: Blocker
 Fix For: 1.2.0


 Use the following command to build Spark:
 mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.1 -Phive -DskipTests clean 
 package
 Then use Beeline to connect to the Thrift server and get this error:
  
 14/11/04 11:30:31 INFO ObjectStore: Initialized ObjectStore
 14/11/04 11:30:31 INFO AbstractService: Service:ThriftBinaryCLIService is 
 started.
 14/11/04 11:30:31 INFO AbstractService: Service:HiveServer2 is started.
 14/11/04 11:30:31 INFO HiveThriftServer2: HiveThriftServer2 started
 14/11/04 11:30:31 INFO ThriftCLIService: ThriftBinaryCLIService listening on 
 0.0.0.0/0.0.0.0:1
 14/11/04 11:33:26 INFO ThriftCLIService: Client protocol version: 
 HIVE_CLI_SERVICE_PROTOCOL_V6
 14/11/04 11:33:26 INFO HiveMetaStore: No user is added in admin role, since 
 config is empty
 14/11/04 11:33:26 INFO SessionState: No Tez session required at this point. 
 hive.execution.engine=mr.
 14/11/04 11:33:26 INFO SessionState: No Tez session required at this point. 
 hive.execution.engine=mr.
 14/11/04 11:33:26 ERROR TThreadPoolServer: Thrift error occurred during 
 processing of message.
 org.apache.thrift.protocol.TProtocolException: Cannot write a TUnion with no 
 set value!
   at org.apache.thrift.TUnion$TUnionStandardScheme.write(TUnion.java:240)
   at org.apache.thrift.TUnion$TUnionStandardScheme.write(TUnion.java:213)
   at org.apache.thrift.TUnion.write(TUnion.java:152)
   at 
 org.apache.hive.service.cli.thrift.TGetInfoResp$TGetInfoRespStandardScheme.write(TGetInfoResp.java:456)
   at 
 org.apache.hive.service.cli.thrift.TGetInfoResp$TGetInfoRespStandardScheme.write(TGetInfoResp.java:406)
   at 
 org.apache.hive.service.cli.thrift.TGetInfoResp.write(TGetInfoResp.java:341)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$GetInfo_result$GetInfo_resultStandardScheme.write(TCLIService.java:3754)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$GetInfo_result$GetInfo_resultStandardScheme.write(TCLIService.java:3718)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$GetInfo_result.write(TCLIService.java:3669)
   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
   at 
 org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
   at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4237) add Manifest File for Maven building

2014-11-04 Thread wangfei (JIRA)
wangfei created SPARK-4237:
--

 Summary: add Manifest File for Maven building
 Key: SPARK-4237
 URL: https://issues.apache.org/jira/browse/SPARK-4237
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


Running the Spark SQL JDBC/ODBC client, the output is:

JackydeMacBook-Pro:spark1 jackylee$ bin/beeline 
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Beeline version ??? by Apache Hive

We should add a manifest file for the Maven build.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4237) add Manifest File for Maven building

2014-11-04 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14197715#comment-14197715
 ] 

wangfei commented on SPARK-4237:


The title is not correct; it should be Generate right Manifest File for maven 
building. Currently the manifest file in use is Guava's, which leads to the 
issues my PR describes. 


 add Manifest File for Maven building
 

 Key: SPARK-4237
 URL: https://issues.apache.org/jira/browse/SPARK-4237
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


 Running the Spark SQL JDBC/ODBC client, the output is:
 JackydeMacBook-Pro:spark1 jackylee$ bin/beeline 
 Spark assembly has been built with Hive, including Datanucleus jars on 
 classpath
 Beeline version ??? by Apache Hive
 We should add a manifest file for the Maven build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4237) Generate right Manifest File for maven building

2014-11-04 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4237:
---
Summary: Generate right Manifest File for maven building  (was: add 
Manifest File for Maven building)

 Generate right Manifest File for maven building
 ---

 Key: SPARK-4237
 URL: https://issues.apache.org/jira/browse/SPARK-4237
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


 Running the Spark SQL JDBC/ODBC client, the output is:
 JackydeMacBook-Pro:spark1 jackylee$ bin/beeline 
 Spark assembly has been built with Hive, including Datanucleus jars on 
 classpath
 Beeline version ??? by Apache Hive
 We should add a manifest file for the Maven build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4191) move wrapperFor to HiveInspectors to reuse them

2014-11-01 Thread wangfei (JIRA)
wangfei created SPARK-4191:
--

 Summary: move wrapperFor to HiveInspectors to reuse them
 Key: SPARK-4191
 URL: https://issues.apache.org/jira/browse/SPARK-4191
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


Move wrapperFor from InsertIntoHiveTable to HiveInspectors so it can be reused; 
the method is needed whenever data is written with an ObjectInspector (such as 
for ORC support).
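
The shape of the refactor, as a toy sketch (simplified types; the real method 
dispatches on ObjectInspector): hoisting the helper into a shared trait lets any 
Hive-facing writer mix it in.

```
// Toy sketch: a conversion helper hoisted into a shared trait for reuse.
trait HiveInspectorsLike {
  // Returns a wrapper converting a Catalyst value to the Hive-side form.
  def wrapperFor(hiveTypeName: String): Any => Any = hiveTypeName match {
    case "string" => (v: Any) => String.valueOf(v)
    case _        => (v: Any) => v // real code matches on ObjectInspector
  }
}

object InsertIntoHiveTableLike extends HiveInspectorsLike // existing user
object OrcWriterLike           extends HiveInspectorsLike // new reuser (e.g. ORC)
```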



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4191) move wrapperFor to HiveInspectors to reuse them

2014-11-01 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4191:
---
Issue Type: Improvement  (was: Bug)

 move wrapperFor to HiveInspectors to reuse them
 ---

 Key: SPARK-4191
 URL: https://issues.apache.org/jira/browse/SPARK-4191
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


 Move wrapperFor from InsertIntoHiveTable to HiveInspectors so it can be reused; 
 the method is needed whenever data is written with an ObjectInspector (such as 
 for ORC support).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-3652) upgrade spark sql hive version to 0.13.1

2014-10-31 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei resolved SPARK-3652.

Resolution: Fixed

 upgrade spark sql hive version to 0.13.1
 

 Key: SPARK-3652
 URL: https://issues.apache.org/jira/browse/SPARK-3652
 Project: Spark
  Issue Type: Dependency upgrade
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 Currently the Spark SQL Hive version is 0.12.0; compiling against 0.13.1 produces errors. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3322) ConnectionManager logs an error when the application ends

2014-10-31 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192701#comment-14192701
 ] 

wangfei commented on SPARK-3322:


Yes, closing this.

 ConnectionManager logs an error when the application ends
 -

 Key: SPARK-3322
 URL: https://issues.apache.org/jira/browse/SPARK-3322
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: wangfei

 Although it does not influence the result, an error is always logged from 
 ConnectionManager.
 Sometimes it only logs ConnectionManagerId(vm2,40992) not found, and sometimes 
 it also logs a CancelledKeyException.
 The log info is as follows:
 14/08/29 16:54:53 ERROR ConnectionManager: Corresponding SendingConnection to 
 ConnectionManagerId(vm2,40992) not found
 14/08/29 16:54:53 INFO ConnectionManager: key already cancelled ? 
 sun.nio.ch.SelectionKeyImpl@457245f9
 java.nio.channels.CancelledKeyException
 at 
 org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:386)
 at 
 org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-2460) Optimize SparkContext.hadoopFile api

2014-10-31 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei closed SPARK-2460.
--
Resolution: Fixed

 Optimize SparkContext.hadoopFile api 
 -

 Key: SPARK-2460
 URL: https://issues.apache.org/jira/browse/SPARK-2460
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 1.0.0
Reporter: wangfei
 Fix For: 1.2.0


 1. Use SparkContext.hadoopRDD() instead of instantiating HadoopRDD directly in 
 SparkContext.hadoopFile.
 2. Broadcast the JobConf in HadoopRDD, not the Configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4177) update build doc for JDBC/CLI already supporting hive 13

2014-10-31 Thread wangfei (JIRA)
wangfei created SPARK-4177:
--

 Summary: update build doc for JDBC/CLI already supporting hive 13
 Key: SPARK-4177
 URL: https://issues.apache.org/jira/browse/SPARK-4177
 Project: Spark
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


Fix the build doc, since Hive 0.13 is already supported in JDBC/CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4001) Add Apriori algorithm to Spark MLlib

2014-10-21 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178183#comment-14178183
 ] 

wangfei commented on SPARK-4001:




Thanks Sean Owen for explaining! Frequent itemset algorithms work by scanning 
the input data set; there is no probabilistic model in nature. 

To answer Xiangrui Meng’s earlier questions:
1.  These algorithms are used for finding major patterns / association rules 
in a data set. As a real use case, some analytic applications in the telecom 
domain use them to find subscriber behavior from a data set combining service 
records, network traffic records, and demographic data. Please refer to this 
Chinese article for an example:  http://www.ss-lw.com/wxxw-361.html
Also, we sometimes use frequent itemset algorithms to prepare feature inputs 
for other algorithms that select features and do other ML tasks like training a 
classifier, as in this paper: http://dl.acm.org/citation.cfm?id=1401922, 

2.  Since Apriori is a basic algorithm for frequent itemset mining, I am 
not aware of any parallel implementation of it. But I think the algorithm fits 
Spark’s data-parallel model since it only needs to scan the input data set. As 
for FP-Growth, I do know there is a Parallel FP-Growth from Haoyuan Li: 
http://dl.acm.org/citation.cfm?id=1454027  . I think I will probably refer to 
this paper to implement FP-Growth in Spark. 

3.  Apriori's computational complexity is about O(N*k), where N is the 
number of items in the input data and k is the depth of the frequent-item tree 
to search. FP-Growth's complexity is about O(N), so it is more efficient than 
Apriori. For space efficiency, FP-Growth is also more efficient than Apriori. 
But for smaller data with many frequent itemsets, Apriori is more efficient, 
because FP-Growth needs to construct an FP-tree out of the input data set, 
which takes some time. Another advantage of Apriori is that it can output 
association rules, while FP-Growth cannot.

Although these two algorithms are basic (FP-Growth is the more complex one), I 
think it would be handy if MLlib included them, since there is no frequent 
itemset mining algorithm in Spark yet, especially for distributed environments. 
Please suggest how to handle this issue. Thanks a lot. 
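
To make the scan-based nature concrete, here is a minimal single-machine sketch 
of one Apriori level, using only the Scala standard library (a toy, not an 
MLlib API proposal): count size-2 candidate itemsets by scanning the 
transactions and keep those meeting the minimum support.

```
// Toy sketch: one level of Apriori (pairs) via a full scan of the transactions.
def frequentPairs(transactions: Seq[Set[String]],
                  minSupport: Int): Map[Set[String], Int] = {
  val counts =
    scala.collection.mutable.Map.empty[Set[String], Int].withDefaultValue(0)
  for (t <- transactions; pair <- t.subsets(2)) // every 2-item subset of a basket
    counts(pair) += 1
  counts.filter { case (_, c) => c >= minSupport }.toMap
}

// Example: {a,b} appears in 2 of 3 baskets, so it survives minSupport = 2.
val baskets = Seq(Set("a", "b", "c"), Set("a", "b"), Set("b", "c"))
assert(frequentPairs(baskets, 2).contains(Set("a", "b")))
```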


 Add Apriori algorithm to Spark MLlib
 

 Key: SPARK-4001
 URL: https://issues.apache.org/jira/browse/SPARK-4001
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Reporter: Jacky Li
Assignee: Jacky Li

 Apriori is the classic algorithm for frequent item set mining in a 
 transactional data set.  It will be useful if Apriori algorithm is added to 
 MLLib in Spark



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-4001) Add Apriori algorithm to Spark MLlib

2014-10-21 Thread wangfei (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178183#comment-14178183
 ] 

wangfei edited comment on SPARK-4001 at 10/21/14 9:38 AM:
--

.


was (Author: scwf):


Thanks Sean Owen for explaining! Frequent itemset algorithm works by scanning 
the input data set, there is no probabilistic model in nature. 

To answer Xiangrui Meng’s earlier questions:
1.  These algorithm is used for finding major patterns / association rules 
in a data set. For a real use case, some analytic applications in telecom 
domain use them to find subscriber behavior from the data set combining service 
record, network traffic record, and demographic data. Please refer to this 
Chinese article for example:  http://www.ss-lw.com/wxxw-361.html
And, sometimes we use frequent itemset algorithm for preparing features input 
to other algorithm which selects feature and do other ML task like training a 
classifier, like this paper: http://dl.acm.org/citation.cfm?id=1401922, 

2.  Since Apriori is a basic algorithm for frequent itemset mining, I am 
not aware of any parallel implementation for it. But I think the algorithm fits 
Spark’s data parallel model since it only need to scan the input data set. And 
for FP-Growth, I do know there is a Parallel FP-Growth from Haoyuan Li: 
http://dl.acm.org/citation.cfm?id=1454027  . I think I probably will refer  to 
this paper to implement FP-Growth in Spark 

3.  The Apriori computation complexity is about O(N*k) where N is the 
number of item in input data and k is the depth of the frequent item tree to 
search. FP-Grwoth complexity is about O(N),  it is more efficient comparing to 
Apriori. For space efficiency, FP-growth is also more efficient than Apriori. 
But in case of smaller data and if frequent itemset is more, Apriori is more 
efficient. This is because FP-Growth need to construct a FP Tree out of the 
input data set, and it needs some time. And another advantage of Apriori is 
that it can output association rules while FP-Growth can not.

Although these two algorithms are basic algo (FP-Growth is more complex), I 
think it will be handy if mllib can include them since there is no frequent 
itemset mining algo in Spark yet, and especially in distributed environment. 
Please suggest how to handle this issue. Thanks a lot. 


 Add Apriori algorithm to Spark MLlib
 

 Key: SPARK-4001
 URL: https://issues.apache.org/jira/browse/SPARK-4001
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Reporter: Jacky Li
Assignee: Jacky Li

 Apriori is the classic algorithm for frequent item set mining in a 
 transactional data set.  It will be useful if Apriori algorithm is added to 
 MLLib in Spark



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Issue Comment Deleted] (SPARK-4001) Add Apriori algorithm to Spark MLlib

2014-10-21 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4001:
---
Comment: was deleted

(was: .)

 Add Apriori algorithm to Spark MLlib
 

 Key: SPARK-4001
 URL: https://issues.apache.org/jira/browse/SPARK-4001
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Reporter: Jacky Li
Assignee: Jacky Li

 Apriori is the classic algorithm for frequent item set mining in a 
 transactional data set.  It will be useful if Apriori algorithm is added to 
 MLLib in Spark



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4041) convert attributes names in table scan lowercase when compare with relation attributes

2014-10-21 Thread wangfei (JIRA)
wangfei created SPARK-4041:
--

 Summary: convert attributes names in table scan lowercase when 
compare with relation attributes
 Key: SPARK-4041
 URL: https://issues.apache.org/jira/browse/SPARK-4041
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.1.1






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-4042) append columns ids and names before broadcast

2014-10-21 Thread wangfei (JIRA)
wangfei created SPARK-4042:
--

 Summary: append columns ids and names before broadcast
 Key: SPARK-4042
 URL: https://issues.apache.org/jira/browse/SPARK-4042
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.1.1






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4042) append columns ids and names before broadcast

2014-10-21 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4042:
---
Description: Appended column ids and names will not be broadcast, because we 
append them after creating the table reader

 append columns ids and names before broadcast
 -

 Key: SPARK-4042
 URL: https://issues.apache.org/jira/browse/SPARK-4042
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 Appended column ids and names will not be broadcast, because we append them 
 after creating the table reader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4042) append columns ids and names before broadcast

2014-10-21 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4042:
---
Description: 
Appended column ids and names will not be broadcast, because we append them after 
creating the table reader. This means the config broadcast to the executor side 
does not contain the configs of the appended columns and names.


  was:
appended columns ids and names will not broadcast because we append them after 
create table reader. This leads to the config broadcasted to executor side dose 
not contain the configs of appended columns and names



 append columns ids and names before broadcast
 -

 Key: SPARK-4042
 URL: https://issues.apache.org/jira/browse/SPARK-4042
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 Appended column ids and names will not be broadcast, because we append them 
 after creating the table reader. This means the config broadcast to the 
 executor side does not contain the configs of the appended columns and names.
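
The ordering bug in miniature (toy sketch, plain Scala; the broadcast stand-in 
is just a snapshot): anything added to the configuration after the snapshot is 
taken never reaches the executors.

```
// Toy sketch: a snapshot taken before the append misses the appended entries.
var conf = Map("base.setting" -> "1")
val broadcastSnapshot = conf             // stand-in for sc.broadcast(conf)
conf += ("appended.column.ids" -> "0,2") // appended after the snapshot
assert(!broadcastSnapshot.contains("appended.column.ids")) // executors never see it
```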



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4042) append columns ids and names before broadcast

2014-10-21 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4042:
---
Description: 
Appended column ids and names will not be broadcast, because we append them after 
creating the table reader. This means the config broadcast to the executor side 
does not contain the configs of the appended columns and names. 


  was:
appended columns ids and names will not broadcast because we append them after 
create table reader. This leads to the config broadcasted to executor side dose 
not contain the configs of appended columns and names.



 append columns ids and names before broadcast
 -

 Key: SPARK-4042
 URL: https://issues.apache.org/jira/browse/SPARK-4042
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 Appended column ids and names will not be broadcast, because we append them 
 after creating the table reader. This means the config broadcast to the 
 executor side does not contain the configs of the appended columns and names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4042) append columns ids and names before broadcast

2014-10-21 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-4042:
---
Description: 
Appended column ids and names will not be broadcast, because we append them after 
creating the table reader. This means the config broadcast to the executor side 
does not contain the configs of the appended columns and names


  was:appended columns ids and names will not broadcast because we append them 
after create table reader


 append columns ids and names before broadcast
 -

 Key: SPARK-4042
 URL: https://issues.apache.org/jira/browse/SPARK-4042
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 Appended column ids and names will not be broadcast, because we append them 
 after creating the table reader. This means the config broadcast to the 
 executor side does not contain the configs of the appended columns and names



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3935) Unused variable in PairRDDFunctions.scala

2014-10-13 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-3935:
---
Description: 
There is an unused variable (count) in the saveAsHadoopDataset function in 
PairRDDFunctions.scala. 
It would be better to add a log statement recording the number of lines of output. 

  was:
There is a unused variable (count) in saveAsHadoopDataset function in 
PairRDDFunctions.scala. 
It is better to add a log statement to record the line of the read file. 


 Unused variable in PairRDDFunctions.scala
 -

 Key: SPARK-3935
 URL: https://issues.apache.org/jira/browse/SPARK-3935
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: wangfei
Priority: Minor

 There is an unused variable (count) in the saveAsHadoopDataset function in 
 PairRDDFunctions.scala. 
 It would be better to add a log statement recording the number of lines of output. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3826) enable hive-thriftserver support hive-0.13.1

2014-10-10 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-3826:
---
Affects Version/s: (was: 1.1.1)
   1.1.0

 enable hive-thriftserver support hive-0.13.1
 

 Key: SPARK-3826
 URL: https://issues.apache.org/jira/browse/SPARK-3826
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 Currently hive-thriftserver does not support hive-0.13; make it support both 
 0.12 and 0.13.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3899) wrong links in streaming doc

2014-10-10 Thread wangfei (JIRA)
wangfei created SPARK-3899:
--

 Summary: wrong links in streaming doc
 Key: SPARK-3899
 URL: https://issues.apache.org/jira/browse/SPARK-3899
 Project: Spark
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.1.0
Reporter: wangfei






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3809) make HiveThriftServer2Suite work correctly

2014-10-06 Thread wangfei (JIRA)
wangfei created SPARK-3809:
--

 Summary: make HiveThriftServer2Suite work correctly
 Key: SPARK-3809
 URL: https://issues.apache.org/jira/browse/SPARK-3809
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei
 Fix For: 1.2.0


Currently HiveThriftServer2Suite is a fake test; HiveThriftServer is not actually 
started there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3826) enable hive-thriftserver support hive-0.13.1

2014-10-06 Thread wangfei (JIRA)
wangfei created SPARK-3826:
--

 Summary: enable hive-thriftserver support hive-0.13.1
 Key: SPARK-3826
 URL: https://issues.apache.org/jira/browse/SPARK-3826
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.1
Reporter: wangfei


Currently hive-thriftserver does not support hive-0.13; make it support both 
0.12 and 0.13.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-3793) use hiveconf when parse hive ql

2014-10-06 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei closed SPARK-3793.
--
Resolution: Fixed

This should be fixed in #2241.

 use hiveconf when parse hive ql
 ---

 Key: SPARK-3793
 URL: https://issues.apache.org/jira/browse/SPARK-3793
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 Currently Spark's Hive support parses SQL using 
 def getAst(sql: String): ASTNode = ParseUtils.findRootNonNullToken((new 
 ParseDriver).parse(sql))
 This is OK in hive-0.12 but will lead to an NPE with the hive-0.13 version.
 So add a hiveconf here to make it general enough to be compatible with both 
 hive-0.12 and hive-0.13.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3806) minor bug and exception in CliSuite

2014-10-05 Thread wangfei (JIRA)
wangfei created SPARK-3806:
--

 Summary: minor bug and exception in CliSuite
 Key: SPARK-3806
 URL: https://issues.apache.org/jira/browse/SPARK-3806
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei


CliSuite throws an exception as follows:
Exception in thread Thread-6 java.lang.IndexOutOfBoundsException: 6
at 
scala.collection.mutable.ResizableArray$class.apply(ResizableArray.scala:43)
at scala.collection.mutable.ArrayBuffer.apply(ArrayBuffer.scala:47)
at 
org.apache.spark.sql.hive.thriftserver.CliSuite.org$apache$spark$sql$hive$thriftserver$CliSuite$$captureOutput$1(CliSuite.scala:67)
at 
org.apache.spark.sql.hive.thriftserver.CliSuite$$anonfun$4.apply(CliSuite.scala:78)
at 
org.apache.spark.sql.hive.thriftserver.CliSuite$$anonfun$4.apply(CliSuite.scala:78)
at scala.sys.process.ProcessLogger$$anon$1.out(ProcessLogger.scala:96)
at 
scala.sys.process.BasicIO$$anonfun$processOutFully$1.apply(BasicIO.scala:135)
at 
scala.sys.process.BasicIO$$anonfun$processOutFully$1.apply(BasicIO.scala:135)
at scala.sys.process.BasicIO$.readFully$1(BasicIO.scala:175)
at scala.sys.process.BasicIO$.processLinesFully(BasicIO.scala:179)
at 
scala.sys.process.BasicIO$$anonfun$processFully$1.apply(BasicIO.scala:164)
at 
scala.sys.process.BasicIO$$anonfun$processFully$1.apply(BasicIO.scala:162)
at 
scala.sys.process.ProcessBuilderImpl$Simple$$anonfun$3.apply$mcV$sp(ProcessBuilderImpl.scala:73)
at scala.sys.process.ProcessImpl$Spawn$$anon$1.run(ProcessImpl.scala:22)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3806) minor bug in CliSuite

2014-10-05 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-3806:
---
Summary: minor bug in CliSuite  (was: minor bug and exception in CliSuite)

 minor bug in CliSuite
 -

 Key: SPARK-3806
 URL: https://issues.apache.org/jira/browse/SPARK-3806
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 CliSuite throws an exception as follows:
 Exception in thread Thread-6 java.lang.IndexOutOfBoundsException: 6
   at 
 scala.collection.mutable.ResizableArray$class.apply(ResizableArray.scala:43)
   at scala.collection.mutable.ArrayBuffer.apply(ArrayBuffer.scala:47)
   at 
 org.apache.spark.sql.hive.thriftserver.CliSuite.org$apache$spark$sql$hive$thriftserver$CliSuite$$captureOutput$1(CliSuite.scala:67)
   at 
 org.apache.spark.sql.hive.thriftserver.CliSuite$$anonfun$4.apply(CliSuite.scala:78)
   at 
 org.apache.spark.sql.hive.thriftserver.CliSuite$$anonfun$4.apply(CliSuite.scala:78)
   at scala.sys.process.ProcessLogger$$anon$1.out(ProcessLogger.scala:96)
   at 
 scala.sys.process.BasicIO$$anonfun$processOutFully$1.apply(BasicIO.scala:135)
   at 
 scala.sys.process.BasicIO$$anonfun$processOutFully$1.apply(BasicIO.scala:135)
   at scala.sys.process.BasicIO$.readFully$1(BasicIO.scala:175)
   at scala.sys.process.BasicIO$.processLinesFully(BasicIO.scala:179)
   at 
 scala.sys.process.BasicIO$$anonfun$processFully$1.apply(BasicIO.scala:164)
   at 
 scala.sys.process.BasicIO$$anonfun$processFully$1.apply(BasicIO.scala:162)
   at 
 scala.sys.process.ProcessBuilderImpl$Simple$$anonfun$3.apply$mcV$sp(ProcessBuilderImpl.scala:73)
   at scala.sys.process.ProcessImpl$Spawn$$anon$1.run(ProcessImpl.scala:22)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3792) enable JavaHiveQLSuite

2014-10-04 Thread wangfei (JIRA)
wangfei created SPARK-3792:
--

 Summary: enable JavaHiveQLSuite
 Key: SPARK-3792
 URL: https://issues.apache.org/jira/browse/SPARK-3792
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3793) add para hiveconf when parse hive ql

2014-10-04 Thread wangfei (JIRA)
wangfei created SPARK-3793:
--

 Summary: add para hiveconf when parse hive ql
 Key: SPARK-3793
 URL: https://issues.apache.org/jira/browse/SPARK-3793
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei


Currently Spark's Hive support parses SQL using 
def getAst(sql: String): ASTNode = ParseUtils.findRootNonNullToken((new 
ParseDriver).parse(sql))
This is OK in hive-0.12 but will lead to an NPE with the hive-0.13 version.

So add a hiveconf here to make it general enough to be compatible with both 
hive-0.12 and hive-0.13.
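
A hedged sketch of the intended change (the two-argument parse overload and the 
Context constructor are assumptions about the Hive API; verify the signatures 
against the actual Hive 0.13 code before relying on this):

```
// Sketch only: assumes Hive's ParseDriver has a parse(sql, context) overload
// and that a Context can be built from a HiveConf.
import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.ql.Context
import org.apache.hadoop.hive.ql.parse.{ASTNode, ParseDriver, ParseUtils}

def getAst(sql: String, hiveConf: HiveConf): ASTNode = {
  val ctx = new Context(hiveConf) // gives the 0.13 parser the state it NPEs without
  ParseUtils.findRootNonNullToken((new ParseDriver).parse(sql, ctx))
}
```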



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3793) use hiveconf when parse hive ql

2014-10-04 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-3793:
---
Summary: use hiveconf when parse hive ql  (was: add para hiveconf when 
parse hive ql)

 use hiveconf when parse hive ql
 ---

 Key: SPARK-3793
 URL: https://issues.apache.org/jira/browse/SPARK-3793
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 Currently Spark's Hive support parses SQL using 
 def getAst(sql: String): ASTNode = ParseUtils.findRootNonNullToken((new 
 ParseDriver).parse(sql))
 This is OK in hive-0.12 but will lead to an NPE with the hive-0.13 version.
 So add a hiveconf here to make it general enough to be compatible with both 
 hive-0.12 and hive-0.13.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3793) add para hiveconf when parse hive ql

2014-10-04 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-3793:
---
Summary: add para hiveconf when parse hive ql  (was: use hiveconf when 
parse hive ql)

 add para hiveconf when parse hive ql
 

 Key: SPARK-3793
 URL: https://issues.apache.org/jira/browse/SPARK-3793
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 Currently Spark's Hive support parses SQL using 
 def getAst(sql: String): ASTNode = ParseUtils.findRootNonNullToken((new 
 ParseDriver).parse(sql))
 This is OK in hive-0.12 but will lead to an NPE with the hive-0.13 version.
 So add a hiveconf here to make it general enough to be compatible with both 
 hive-0.12 and hive-0.13.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3793) use hiveconf when parse hive ql

2014-10-04 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-3793:
---
Summary: use hiveconf when parse hive ql  (was: add para hiveconf when 
parse hive ql)

 use hiveconf when parse hive ql
 ---

 Key: SPARK-3793
 URL: https://issues.apache.org/jira/browse/SPARK-3793
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.1.0
Reporter: wangfei

 Currently Spark's Hive support parses SQL using 
 def getAst(sql: String): ASTNode = ParseUtils.findRootNonNullToken((new 
 ParseDriver).parse(sql))
 This is OK in hive-0.12 but will lead to an NPE with the hive-0.13 version.
 So add a hiveconf here to make it general enough to be compatible with both 
 hive-0.12 and hive-0.13.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3765) add testing with sbt to doc

2014-10-02 Thread wangfei (JIRA)
wangfei created SPARK-3765:
--

 Summary: add testing with sbt to doc
 Key: SPARK-3765
 URL: https://issues.apache.org/jira/browse/SPARK-3765
 Project: Spark
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: wangfei






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3766) Snappy is also the default compression codec for broadcast variables

2014-10-02 Thread wangfei (JIRA)
wangfei created SPARK-3766:
--

 Summary: Snappy is also the default compression codec for 
broadcast variables
 Key: SPARK-3766
 URL: https://issues.apache.org/jira/browse/SPARK-3766
 Project: Spark
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: wangfei






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3766) Snappy is also the default compression codec for broadcast variables

2014-10-02 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-3766:
---
Component/s: Documentation

 Snappy is also the default compression codec for broadcast variables
 

 Key: SPARK-3766
 URL: https://issues.apache.org/jira/browse/SPARK-3766
 Project: Spark
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.1.0
Reporter: wangfei





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3765) add testing with sbt to doc

2014-10-02 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-3765:
---
Component/s: Documentation

 add testing with sbt to doc
 ---

 Key: SPARK-3765
 URL: https://issues.apache.org/jira/browse/SPARK-3765
 Project: Spark
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.1.0
Reporter: wangfei





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3755) Do not bind port 1 - 1024 to server in spark

2014-10-01 Thread wangfei (JIRA)
wangfei created SPARK-3755:
--

 Summary: Do not bind port 1 - 1024 to server in spark
 Key: SPARK-3755
 URL: https://issues.apache.org/jira/browse/SPARK-3755
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: wangfei


When a non-root user starts a Jetty server on a port in the range 1-1024, it fails with 
java.net.SocketException: Permission denied
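
One way to surface this earlier, as a minimal sketch (hypothetical helper, not Spark's exact code): validate the requested port before handing it to Jetty, so a non-root user gets an actionable error instead of a bare SocketException.

```scala
// Reject privileged ports (1-1023) up front; only root may bind them on
// most Unix systems. Port 0 asks the OS for an ephemeral port.
def checkStartPort(port: Int): Unit = {
  require(port == 0 || (port >= 1024 && port <= 65535),
    s"Port $port is privileged or out of range; " +
      "use 0 for an ephemeral port or a value in 1024-65535")
}
```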



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3756) check exception is caused by an address-port collision when binding properly

2014-10-01 Thread wangfei (JIRA)
wangfei created SPARK-3756:
--

 Summary: check exception is caused by an address-port collision 
when binding properly
 Key: SPARK-3756
 URL: https://issues.apache.org/jira/browse/SPARK-3756
 Project: Spark
  Issue Type: Bug
Reporter: wangfei






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3756) check exception is caused by an address-port collision when binding properly

2014-10-01 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-3756:
---
Affects Version/s: 1.1.0

 check exception is caused by an address-port collision when binding properly
 

 Key: SPARK-3756
 URL: https://issues.apache.org/jira/browse/SPARK-3756
 Project: Spark
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: wangfei





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3756) check exception is caused by an address-port collision when binding properly

2014-10-01 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-3756:
---
 Description: a tiny bug in the method isBindCollision
Target Version/s: 1.2.0

 check exception is caused by an address-port collision when binding properly
 

 Key: SPARK-3756
 URL: https://issues.apache.org/jira/browse/SPARK-3756
 Project: Spark
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: wangfei

 a tiny bug in the method isBindCollision
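
For orientation, a minimal sketch of what such a check can look like (hypothetical; not necessarily Spark's exact Utils.isBindCollision). The essential point is walking the cause chain, so a BindException wrapped inside another exception is still recognized as an address-port collision worth retrying on.

```scala
import java.net.BindException

// Returns true if the failure anywhere in the cause chain is a bind
// collision; false once the chain is exhausted.
def isBindCollision(exception: Throwable): Boolean = exception match {
  case _: BindException => true
  case e if e != null   => isBindCollision(e.getCause)
  case _                => false
}
```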



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3756) check exception is caused by an address-port collision properly

2014-10-01 Thread wangfei (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangfei updated SPARK-3756:
---
Summary: check exception is caused by an address-port collision properly  
(was: check exception is caused by an address-port collision when binding 
properly)

 check exception is caused by an address-port collision properly
 ---

 Key: SPARK-3756
 URL: https://issues.apache.org/jira/browse/SPARK-3756
 Project: Spark
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: wangfei

 a tiny bug in the method isBindCollision



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



  1   2   >