[jira] [Updated] (SPARK-31069) high cpu caused by chunksBeingTransferred in external shuffle service

2020-03-07 Thread Xiaoju Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoju Wu updated SPARK-31069:
------------------------------
Description: 
"shuffle-chunk-fetch-handler-2-40" #250 daemon prio=5 os_prio=0 tid=0x02ac nid=0xb9b3 runnable [0x7ff20a1af000]
   java.lang.Thread.State: RUNNABLE
        at java.util.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3339)
        at java.util.concurrent.ConcurrentHashMap$ValueIterator.next(ConcurrentHashMap.java:3439)
        at org.apache.spark.network.server.OneForOneStreamManager.chunksBeingTransferred(OneForOneStreamManager.java:184)
        at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:85)
        at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:51)
        at org.spark_project.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:38)
        at org.spark_project.io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:353)
        at org.spark_project.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
        at org.spark_project.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
        at org.spark_project.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
        at org.spark_project.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
        at org.spark_project.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
        at java.lang.Thread.run(Thread.java:748)

"shuffle-chunk-fetch-handler-2-48" #235 daemon prio=5 os_prio=0 tid=0x7ff2302ec800 nid=0xb9ad runnable [0x7ff20a7b4000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.spark.network.server.OneForOneStreamManager.chunksBeingTransferred(OneForOneStreamManager.java:186)
        at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:85)
        at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:51)
        at org.spark_project.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:38)
        at org.spark_project.io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:353)
        at org.spark_project.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
        at org.spark_project.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
        at org.spark_project.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
        at org.spark_project.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
        at org.spark_project.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
        at java.lang.Thread.run(Thread.java:748)
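
Both thread dumps above show handler threads inside OneForOneStreamManager.chunksBeingTransferred, called from ChunkFetchRequestHandler.channelRead0, and the first is walking a ConcurrentHashMap value iterator. The Python sketch below illustrates why that pattern can burn CPU. It assumes, without confirmation from this report, that the method sums a per-stream counter over every registered stream on each fetch request. All class and method names are hypothetical, and the running counter is only one possible alternative, not the project's fix.

```python
import threading


class ScanningManager:
    """Hypothetical: recomputes the total by visiting every stream."""

    def __init__(self):
        self.streams = {}  # stream_id -> chunks currently in flight

    def chunks_being_transferred(self):
        # O(number of streams) work on every call.
        return sum(self.streams.values())


class CountingManager:
    """Hypothetical alternative: keeps a running total under a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._total = 0

    def chunk_being_sent(self):
        with self._lock:
            self._total += 1

    def chunk_sent(self):
        with self._lock:
            self._total -= 1

    def chunks_being_transferred(self):
        # O(1) regardless of how many streams are registered.
        with self._lock:
            return self._total


scanning = ScanningManager()
counting = CountingManager()
for stream_id in range(1000):
    scanning.streams[stream_id] = 1
    counting.chunk_being_sent()
```

With many open streams and many concurrent fetch requests, the scanning variant repeats the full walk on every request, while the counting variant does constant work per call.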

> high cpu caused by chunksBeingTransferred in external shuffle service
> ---------------------------------------------------------------------
>
>              Key: SPARK-31069
>              URL: https://issues.apache.org/jira/browse/SPARK-31069
>          Project: Spark
>       Issue Type: Improvement
>       Components: Shuffle
> Affects Versions: 3.0.0
>         Reporter: Xiaoju Wu
>         Priority: Major
>
> "shuffle-chunk-fetch-handler-2-40" #250 daemon prio=5 os_prio=0 tid=0x02ac nid=0xb9b3 runnable [0x7ff20a1af000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3339)
>         at java.util.concurrent.ConcurrentHashMap$ValueIterator.next(ConcurrentHashMap.java:3439)
>         at org.apache.spark.network.server.OneForOneStreamManager.chunksBeingTransferred(OneForOneStreamManager.java:184)
>         at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:85)
>         at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:51)
>         at org.spark_project.io.netty

[jira] [Resolved] (SPARK-31002) Add version information to the configuration of Core

2020-03-07 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-31002.
----------------------------------
Fix Version/s: 3.1.0
   Resolution: Fixed

Issue resolved by pull request 27847
[https://github.com/apache/spark/pull/27847]

> Add version information to the configuration of Core
> ----------------------------------------------------
>
>              Key: SPARK-31002
>              URL: https://issues.apache.org/jira/browse/SPARK-31002
>          Project: Spark
>       Issue Type: Sub-task
>       Components: Spark Core
> Affects Versions: 3.1.0
>         Reporter: jiaan.geng
>         Priority: Major
>          Fix For: 3.1.0
>
>
> core/src/main/scala/org/apache/spark/internal/config/package.scala



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31080) Bugs/missing functions in documents

2020-03-07 Thread Viet (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054260#comment-17054260
 ] 

Viet commented on SPARK-31080:
------------------------------

[~viirya] I mean this issue: 
[SPARK-24035|https://issues.apache.org/jira/browse/SPARK-24035].

But now I realize that PIVOT is not a function, so it is not listed in that 
documentation. Is there any other document showing how to use it? I only found 
some mentions in tutorials and on Stack Overflow.

> Bugs/missing functions in documents
> -----------------------------------
>
>              Key: SPARK-31080
>              URL: https://issues.apache.org/jira/browse/SPARK-31080
>          Project: Spark
>       Issue Type: Bug
>       Components: SQL
> Affects Versions: 2.4.5
>         Reporter: Viet
>         Priority: Minor
>
> In the current documentation for the SQL API, I noticed that there is no 
> section for the `PIVOT` keyword, which was introduced in 2.4.0.
> Is there a bug in `mkdocs`?
> Docs: [https://spark.apache.org/docs/latest/api/sql/]
> P.S. I'm not sure whether this issue belongs here, but I could not find any 
> other place to put it.




[jira] [Commented] (SPARK-31009) Support json_object_keys function

2020-03-07 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054251#comment-17054251
 ] 

Takeshi Yamamuro commented on SPARK-31009:
------------------------------------------

I saw that you are working on implementations of new JSON functions. If you 
plan to open more PRs for these functions, I think it would be better to first 
list all the JSON functions you plan to work on in a parent Jira, so that other 
developers can follow your activity more easily. Also, I think it might be 
worth documenting these functions somewhere, as PostgreSQL does: 
[https://www.postgresql.org/docs/current/functions-json.html]

> Support json_object_keys function
> ---------------------------------
>
>              Key: SPARK-31009
>              URL: https://issues.apache.org/jira/browse/SPARK-31009
>          Project: Spark
>       Issue Type: New Feature
>       Components: SQL
> Affects Versions: 3.1.0
>         Reporter: Rakesh Raushan
>         Priority: Major
>
> This function will return all the keys of the outer JSON object.
>
> PostgreSQL -> [https://www.postgresql.org/docs/9.3/functions-json.html]
> MySQL -> 
> [https://dev.mysql.com/doc/refman/8.0/en/json-function-reference.html]
> MariaDB -> [https://mariadb.com/kb/en/json-functions/]
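
As a rough illustration of the behaviour described in the issue (return only the keys of the outermost JSON object), here is a small Python sketch. The helper name simply borrows the issue title, and its handling of invalid or non-object input (returning None) is an assumption made for illustration, not the semantics of the proposed Spark function.

```python
import json


def json_object_keys(text):
    """Return the top-level keys of a JSON object, or None otherwise."""
    try:
        value = json.loads(text)
    except (TypeError, ValueError):
        return None  # assumption: invalid JSON yields None
    if not isinstance(value, dict):
        return None  # assumption: arrays and scalars yield None
    # Keys of nested objects are deliberately not included.
    return list(value.keys())


keys = json_object_keys('{"a": 1, "b": {"c": 2}}')
```

Here `keys` contains only "a" and "b"; the nested key "c" is not reported.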




[jira] [Resolved] (SPARK-30934) ML, GraphX 3.0 QA: Programming guide update and migration guide

2020-03-07 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen resolved SPARK-30934.
----------------------------------
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 27785
[https://github.com/apache/spark/pull/27785]

> ML, GraphX 3.0 QA: Programming guide update and migration guide
> ---------------------------------------------------------------
>
>              Key: SPARK-30934
>              URL: https://issues.apache.org/jira/browse/SPARK-30934
>          Project: Spark
>       Issue Type: Sub-task
>       Components: Documentation, GraphX, ML, MLlib
> Affects Versions: 3.0.0
>         Reporter: zhengruifeng
>         Assignee: Huaxin Gao
>         Priority: Major
>          Fix For: 3.0.0
>
>
> Before the release, we need to update the MLlib and GraphX Programming 
> Guides. Updates will include:
>  * Add migration guide subsection.
>  ** Use the results of the QA audit JIRAs.
>  * Check phrasing, especially in main sections (for outdated items such as 
> "In this release, ...")




[jira] [Assigned] (SPARK-30934) ML, GraphX 3.0 QA: Programming guide update and migration guide

2020-03-07 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen reassigned SPARK-30934:
------------------------------------

Assignee: Huaxin Gao

> ML, GraphX 3.0 QA: Programming guide update and migration guide
> ---------------------------------------------------------------
>
>              Key: SPARK-30934
>              URL: https://issues.apache.org/jira/browse/SPARK-30934
>          Project: Spark
>       Issue Type: Sub-task
>       Components: Documentation, GraphX, ML, MLlib
> Affects Versions: 3.0.0
>         Reporter: zhengruifeng
>         Assignee: Huaxin Gao
>         Priority: Major
>
> Before the release, we need to update the MLlib and GraphX Programming 
> Guides. Updates will include:
>  * Add migration guide subsection.
>  ** Use the results of the QA audit JIRAs.
>  * Check phrasing, especially in main sections (for outdated items such as 
> "In this release, ...")




[jira] [Commented] (SPARK-31080) Bugs/missing functions in documents

2020-03-07 Thread L. C. Hsieh (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054232#comment-17054232
 ] 

L. C. Hsieh commented on SPARK-31080:
-------------------------------------

Do you mean "PivotFirst"? Otherwise, I can't find pivot among the SQL expressions.

> Bugs/missing functions in documents
> -----------------------------------
>
>              Key: SPARK-31080
>              URL: https://issues.apache.org/jira/browse/SPARK-31080
>          Project: Spark
>       Issue Type: Bug
>       Components: SQL
> Affects Versions: 2.4.5
>         Reporter: Viet
>         Priority: Minor
>
> In the current documentation for the SQL API, I noticed that there is no 
> section for the `PIVOT` keyword, which was introduced in 2.4.0.
> Is there a bug in `mkdocs`?
> Docs: [https://spark.apache.org/docs/latest/api/sql/]
> P.S. I'm not sure whether this issue belongs here, but I could not find any 
> other place to put it.




[jira] [Resolved] (SPARK-31014) InMemoryStore: CountingRemoveIfForEach misses to remove key from parentToChildrenMap

2020-03-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-31014.
-----------------------------------
Fix Version/s: 3.0.0
     Assignee: Jungtaek Lim
   Resolution: Fixed

This was resolved via the following pull requests:
- https://github.com/apache/spark/pull/27765 (master)
- https://github.com/apache/spark/pull/27825 (branch-3.0)

> InMemoryStore: CountingRemoveIfForEach misses to remove key from parentToChildrenMap
> ------------------------------------------------------------------------------------
>
>              Key: SPARK-31014
>              URL: https://issues.apache.org/jira/browse/SPARK-31014
>          Project: Spark
>       Issue Type: Bug
>       Components: Spark Core
> Affects Versions: 3.0.0
>         Reporter: Jungtaek Lim
>         Assignee: Jungtaek Lim
>         Priority: Minor
>          Fix For: 3.0.0
>
>
> SPARK-30964 introduced a secondary index that records the parent-children 
> relationship, making it faster to operate on all children of a given parent.
> That change was not applied to CountingRemoveIfForEach, so 
> "countingRemoveAllByIndexValues" may fail to remove the key from 
> parentToChildrenMap.
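
A minimal Python sketch of the kind of bookkeeping described in the issue, assuming a store that keeps a primary map plus a parent-to-children secondary index. The class and method names are hypothetical and do not mirror InMemoryStore's code; the point is only that a bulk removal must also drop the parent key from the index.

```python
class Store:
    """Hypothetical store with a parent -> children secondary index."""

    def __init__(self):
        self.data = {}                # key -> (parent, value)
        self.parent_to_children = {}  # parent -> set of child keys

    def put(self, key, parent, value):
        self.data[key] = (parent, value)
        self.parent_to_children.setdefault(parent, set()).add(key)

    def remove_all_by_parent(self, parent):
        """Remove every child of `parent`; return how many were removed."""
        # Popping the parent key is the step that is easy to miss:
        # deleting only from self.data would leave a stale index entry.
        children = self.parent_to_children.pop(parent, set())
        for key in children:
            del self.data[key]
        return len(children)


store = Store()
store.put("task-1", "stage-1", "a")
store.put("task-2", "stage-1", "b")
store.put("task-3", "stage-2", "c")
removed = store.remove_all_by_parent("stage-1")
```

After the call, both the primary map and the index hold entries for "stage-2" only.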




[jira] [Assigned] (SPARK-30879) Refine doc-building workflow

2020-03-07 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen reassigned SPARK-30879:
------------------------------------

Assignee: Nicholas Chammas

> Refine doc-building workflow
> ----------------------------
>
>              Key: SPARK-30879
>              URL: https://issues.apache.org/jira/browse/SPARK-30879
>          Project: Spark
>       Issue Type: Improvement
>       Components: Documentation
> Affects Versions: 3.1.0
>         Reporter: Nicholas Chammas
>         Assignee: Nicholas Chammas
>         Priority: Minor
>
> There are a few rough edges in the workflow for building docs that could be 
> refined:
>  * sudo pip installing stuff
>  * no pinned versions of any doc dependencies
>  * using some deprecated options
>  * race condition with jekyll serve




[jira] [Resolved] (SPARK-30879) Refine doc-building workflow

2020-03-07 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen resolved SPARK-30879.
----------------------------------
Fix Version/s: 3.1.0
   Resolution: Fixed

Issue resolved by pull request 27534
[https://github.com/apache/spark/pull/27534]

> Refine doc-building workflow
> ----------------------------
>
>              Key: SPARK-30879
>              URL: https://issues.apache.org/jira/browse/SPARK-30879
>          Project: Spark
>       Issue Type: Improvement
>       Components: Documentation
> Affects Versions: 3.1.0
>         Reporter: Nicholas Chammas
>         Assignee: Nicholas Chammas
>         Priority: Minor
>          Fix For: 3.1.0
>
>
> There are a few rough edges in the workflow for building docs that could be 
> refined:
>  * sudo pip installing stuff
>  * no pinned versions of any doc dependencies
>  * using some deprecated options
>  * race condition with jekyll serve




[jira] [Assigned] (SPARK-31012) Update ML 3.0 docs

2020-03-07 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen reassigned SPARK-31012:
------------------------------------

Assignee: Huaxin Gao

> Update ML 3.0 docs
> ------------------
>
>              Key: SPARK-31012
>              URL: https://issues.apache.org/jira/browse/SPARK-31012
>          Project: Spark
>       Issue Type: Improvement
>       Components: Documentation, ML, PySpark
> Affects Versions: 3.0.0
>         Reporter: Huaxin Gao
>         Assignee: Huaxin Gao
>         Priority: Minor
>
> Update the ML docs for the 3.0 changes.




[jira] [Resolved] (SPARK-31012) Update ML 3.0 docs

2020-03-07 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen resolved SPARK-31012.
----------------------------------
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 27762
[https://github.com/apache/spark/pull/27762]

> Update ML 3.0 docs
> ------------------
>
>              Key: SPARK-31012
>              URL: https://issues.apache.org/jira/browse/SPARK-31012
>          Project: Spark
>       Issue Type: Improvement
>       Components: Documentation, ML, PySpark
> Affects Versions: 3.0.0
>         Reporter: Huaxin Gao
>         Assignee: Huaxin Gao
>         Priority: Minor
>          Fix For: 3.0.0
>
>
> Update the ML docs for the 3.0 changes.




[jira] [Updated] (SPARK-31080) Bugs/missing functions in documents

2020-03-07 Thread Viet (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viet updated SPARK-31080:
-------------------------
Description: 
In the current documentation for the SQL API, I noticed that there is no 
section for the `PIVOT` keyword, which was introduced in 2.4.0.

Is there a bug in `mkdocs`?

Docs: [https://spark.apache.org/docs/latest/api/sql/]

P.S. I'm not sure whether this issue belongs here, but I could not find any 
other place to put it.

  was:
In the current documentation for the SQL API, I noticed that there is no 
section for the `PIVOT` keyword, which was introduced in 2.4.0.

Is there a bug in `mkdocs`?

P.S. I'm not sure whether this issue belongs here, but I could not find any 
other place to put it.


> Bugs/missing functions in documents
> -----------------------------------
>
>              Key: SPARK-31080
>              URL: https://issues.apache.org/jira/browse/SPARK-31080
>          Project: Spark
>       Issue Type: Bug
>       Components: SQL
> Affects Versions: 2.4.5
>         Reporter: Viet
>         Priority: Minor
>
> In the current documentation for the SQL API, I noticed that there is no 
> section for the `PIVOT` keyword, which was introduced in 2.4.0.
> Is there a bug in `mkdocs`?
> Docs: [https://spark.apache.org/docs/latest/api/sql/]
> P.S. I'm not sure whether this issue belongs here, but I could not find any 
> other place to put it.




[jira] [Created] (SPARK-31080) Bugs/missing functions in documents

2020-03-07 Thread Viet (Jira)
Viet created SPARK-31080:
-------------------------

         Summary: Bugs/missing functions in documents
             Key: SPARK-31080
             URL: https://issues.apache.org/jira/browse/SPARK-31080
         Project: Spark
      Issue Type: Bug
      Components: SQL
Affects Versions: 2.4.5
        Reporter: Viet


In the current documentation for the SQL API, I noticed that there is no 
section for the `PIVOT` keyword, which was introduced in 2.4.0.

Is there a bug in `mkdocs`?

P.S. I'm not sure whether this issue belongs here, but I could not find any 
other place to put it.


