[jira] [Updated] (SPARK-31069) high cpu caused by chunksBeingTransferred in external shuffle service
[ https://issues.apache.org/jira/browse/SPARK-31069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoju Wu updated SPARK-31069:
------------------------------
    Description:
"shuffle-chunk-fetch-handler-2-40" #250 daemon prio=5 os_prio=0 tid=0x02ac nid=0xb9b3 runnable [0x7ff20a1af000]
   java.lang.Thread.State: RUNNABLE
    at java.util.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3339)
    at java.util.concurrent.ConcurrentHashMap$ValueIterator.next(ConcurrentHashMap.java:3439)
    at org.apache.spark.network.server.OneForOneStreamManager.chunksBeingTransferred(OneForOneStreamManager.java:184)
    at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:85)
    at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:51)
    at org.spark_project.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:38)
    at org.spark_project.io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:353)
    at org.spark_project.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
    at org.spark_project.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
    at org.spark_project.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
    at org.spark_project.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at org.spark_project.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:748)

"shuffle-chunk-fetch-handler-2-48" #235 daemon prio=5 os_prio=0 tid=0x7ff2302ec800 nid=0xb9ad runnable [0x7ff20a7b4000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.spark.network.server.OneForOneStreamManager.chunksBeingTransferred(OneForOneStreamManager.java:186)
    at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:85)
    at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:51)
    at org.spark_project.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:38)
    at org.spark_project.io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:353)
    at org.spark_project.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
    at org.spark_project.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
    at org.spark_project.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
    at org.spark_project.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at org.spark_project.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:748)

> high cpu caused by chunksBeingTransferred in external shuffle service
> ---------------------------------------------------------------------
>
> Key: SPARK-31069
> URL: https://issues.apache.org/jira/browse/SPARK-31069
> Project: Spark
> Issue Type: Improvement
> Components: Shuffle
> Affects Versions: 3.0.0
> Reporter: Xiaoju Wu
> Priority: Major
>
> "shuffle-chunk-fetch-handler-2-40" #250 daemon prio=5 os_prio=0 tid=0x02ac nid=0xb9b3 runnable [0x7ff20a1af000]
>    java.lang.Thread.State: RUNNABLE
>     at java.util.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3339)
>     at java.util.concurrent.ConcurrentHashMap$ValueIterator.next(ConcurrentHashMap.java:3439)
>     at org.apache.spark.network.server.OneForOneStreamManager.chunksBeingTransferred(OneForOneStreamManager.java:184)
>     at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:85)
>     at org.apache.spark.network.server.ChunkFetchRequestHandler.channelRead0(ChunkFetchRequestHandler.java:51)
>     at org.spark_project.io.netty
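The traces above show handler threads inside `OneForOneStreamManager.chunksBeingTransferred`, walking the values of a `ConcurrentHashMap` on the `ChunkFetchRequestHandler.channelRead0` path. The following is only a rough Python sketch of why that pattern can burn CPU, assuming the total is recomputed by visiting every registered stream on each fetch request. The class names `ScanningManager` and `CountingManager` and the `simulate` helper are hypothetical; this is neither Spark's code nor the fix adopted for this ticket.

```python
class ScanningManager:
    """Hypothetical: recomputes the total by visiting every stream per call."""

    def __init__(self):
        self.streams = {}   # stream_id -> chunks currently in flight
        self.visited = 0    # map entries touched so far, for illustration only

    def chunk_sent_start(self, stream_id):
        self.streams[stream_id] = self.streams.get(stream_id, 0) + 1

    def chunks_being_transferred(self):
        total = 0
        for in_flight in self.streams.values():
            self.visited += 1
            total += in_flight
        return total


class CountingManager:
    """Hypothetical alternative: keeps a running total, so reads do not scan.

    A multi-threaded version would need an atomic counter; this sketch is
    single-threaded.
    """

    def __init__(self):
        self.streams = {}
        self.total = 0

    def chunk_sent_start(self, stream_id):
        self.streams[stream_id] = self.streams.get(stream_id, 0) + 1
        self.total += 1

    def chunks_being_transferred(self):
        return self.total


def simulate(manager, num_streams, num_requests):
    """Register one in-flight chunk per stream, then poll the total repeatedly."""
    for stream_id in range(num_streams):
        manager.chunk_sent_start(stream_id)
    return [manager.chunks_being_transferred() for _ in range(num_requests)]
```

With many open streams and many concurrent fetch requests, the scanning variant touches `num_streams * num_requests` entries, while the counting variant does constant work per read.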
[jira] [Resolved] (SPARK-31002) Add version information to the configuration of Core
[ https://issues.apache.org/jira/browse/SPARK-31002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-31002.
----------------------------------
    Fix Version/s: 3.1.0
    Resolution: Fixed

Issue resolved by pull request 27847
[https://github.com/apache/spark/pull/27847]

> Add version information to the configuration of Core
> ----------------------------------------------------
>
> Key: SPARK-31002
> URL: https://issues.apache.org/jira/browse/SPARK-31002
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core
> Affects Versions: 3.1.0
> Reporter: jiaan.geng
> Priority: Major
> Fix For: 3.1.0
>
> core/src/main/scala/org/apache/spark/internal/config/package.scala

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-31080) Bugs/missing functions in documents
[ https://issues.apache.org/jira/browse/SPARK-31080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054260#comment-17054260 ]

Viet commented on SPARK-31080:
------------------------------

[~viirya] I mean this issue: [SPARK-24035|https://issues.apache.org/jira/browse/SPARK-24035]. But now I realize that Pivot is not a function, so it was not listed in the documentation. Is there any other document showing how to use it? I only found some mentions in tutorials and on Stack Overflow.

> Bugs/missing functions in documents
> -----------------------------------
>
> Key: SPARK-31080
> URL: https://issues.apache.org/jira/browse/SPARK-31080
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.5
> Reporter: Viet
> Priority: Minor
>
> In the current documentation for the SQL API, I noticed that there is no
> section for the `PIVOT` keyword, which was introduced in 2.4.0.
> Is there a bug in `mkdocs`?
> Docs: [https://spark.apache.org/docs/latest/api/sql/]
> P.S.: I am not sure whether this issue belongs here, but I could not find
> any other place to put it.
[jira] [Commented] (SPARK-31009) Support json_object_keys function
[ https://issues.apache.org/jira/browse/SPARK-31009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054251#comment-17054251 ]

Takeshi Yamamuro commented on SPARK-31009:
------------------------------------------

I saw you working on the implementations of new JSON functions. If you want to make more PRs for these functions, I think it would be better to first list all the JSON functions you are planning to work on in a parent Jira, so that other developers can follow your activity more easily. Also, I think it might be worth documenting these functions somewhere, as PostgreSQL does: [https://www.postgresql.org/docs/current/functions-json.html]

> Support json_object_keys function
> ---------------------------------
>
> Key: SPARK-31009
> URL: https://issues.apache.org/jira/browse/SPARK-31009
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.1.0
> Reporter: Rakesh Raushan
> Priority: Major
>
> This function will return all the keys of the outer JSON object.
>
> PostgreSQL -> [https://www.postgresql.org/docs/9.3/functions-json.html]
> MySQL -> [https://dev.mysql.com/doc/refman/8.0/en/json-function-reference.html]
> MariaDB -> [https://mariadb.com/kb/en/json-functions/]
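The ticket describes the function as returning all the keys of the outer JSON object. As a rough illustration of that behaviour only, here is a small Python sketch built on the standard `json` module. The helper name `json_object_keys` mirrors the proposed function, but returning `None` for invalid input or for JSON that is not an object is an assumption made for this sketch; the ticket does not specify it, and this is not Spark's implementation.

```python
import json


def json_object_keys(text):
    """Return the top-level keys of a JSON object, or None otherwise.

    The None-on-failure behaviour is an assumption of this sketch.
    """
    try:
        value = json.loads(text)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON.
        return None
    if not isinstance(value, dict):
        # Arrays, numbers, strings, etc. have no object keys.
        return None
    # Only the outer object's keys; nested objects are not descended into.
    return list(value.keys())
```

For example, `json_object_keys('{"a": 1, "b": {"c": 2}}')` yields the two outer keys and ignores the nested key `c`.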
[jira] [Resolved] (SPARK-30934) ML, GraphX 3.0 QA: Programming guide update and migration guide
[ https://issues.apache.org/jira/browse/SPARK-30934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen resolved SPARK-30934.
----------------------------------
    Fix Version/s: 3.0.0
    Resolution: Fixed

Issue resolved by pull request 27785
[https://github.com/apache/spark/pull/27785]

> ML, GraphX 3.0 QA: Programming guide update and migration guide
> ---------------------------------------------------------------
>
> Key: SPARK-30934
> URL: https://issues.apache.org/jira/browse/SPARK-30934
> Project: Spark
> Issue Type: Sub-task
> Components: Documentation, GraphX, ML, MLlib
> Affects Versions: 3.0.0
> Reporter: zhengruifeng
> Assignee: Huaxin Gao
> Priority: Major
> Fix For: 3.0.0
>
> Before the release, we need to update the MLlib and GraphX Programming
> Guides. Updates will include:
> * Add migration guide subsection.
> ** Use the results of the QA audit JIRAs.
> * Check phrasing, especially in main sections (for outdated items such as
> "In this release, ...")
[jira] [Assigned] (SPARK-30934) ML, GraphX 3.0 QA: Programming guide update and migration guide
[ https://issues.apache.org/jira/browse/SPARK-30934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen reassigned SPARK-30934:
------------------------------------
    Assignee: Huaxin Gao

> ML, GraphX 3.0 QA: Programming guide update and migration guide
> ---------------------------------------------------------------
>
> Key: SPARK-30934
> URL: https://issues.apache.org/jira/browse/SPARK-30934
> Project: Spark
> Issue Type: Sub-task
> Components: Documentation, GraphX, ML, MLlib
> Affects Versions: 3.0.0
> Reporter: zhengruifeng
> Assignee: Huaxin Gao
> Priority: Major
>
> Before the release, we need to update the MLlib and GraphX Programming
> Guides. Updates will include:
> * Add migration guide subsection.
> ** Use the results of the QA audit JIRAs.
> * Check phrasing, especially in main sections (for outdated items such as
> "In this release, ...")
[jira] [Commented] (SPARK-31080) Bugs/missing functions in documents
[ https://issues.apache.org/jira/browse/SPARK-31080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054232#comment-17054232 ]

L. C. Hsieh commented on SPARK-31080:
-------------------------------------

Do you mean "PivotFirst"? Otherwise, I can't find pivot among the SQL expressions.

> Bugs/missing functions in documents
> -----------------------------------
>
> Key: SPARK-31080
> URL: https://issues.apache.org/jira/browse/SPARK-31080
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.5
> Reporter: Viet
> Priority: Minor
>
> In the current documentation for the SQL API, I noticed that there is no
> section for the `PIVOT` keyword, which was introduced in 2.4.0.
> Is there a bug in `mkdocs`?
> Docs: [https://spark.apache.org/docs/latest/api/sql/]
> P.S.: I am not sure whether this issue belongs here, but I could not find
> any other place to put it.
[jira] [Resolved] (SPARK-31014) InMemoryStore: CountingRemoveIfForEach misses to remove key from parentToChildrenMap
[ https://issues.apache.org/jira/browse/SPARK-31014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-31014.
-----------------------------------
    Fix Version/s: 3.0.0
    Assignee: Jungtaek Lim
    Resolution: Fixed

This is resolved via the following:
- https://github.com/apache/spark/pull/27765 (master)
- https://github.com/apache/spark/pull/27825 (branch-3.0)

> InMemoryStore: CountingRemoveIfForEach misses to remove key from parentToChildrenMap
> ------------------------------------------------------------------------------------
>
> Key: SPARK-31014
> URL: https://issues.apache.org/jira/browse/SPARK-31014
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.0.0
> Reporter: Jungtaek Lim
> Assignee: Jungtaek Lim
> Priority: Minor
> Fix For: 3.0.0
>
> SPARK-30964 introduced a secondary index that defines the parent-child
> relationship and makes it faster to operate on all children of a given
> parent.
> That change was not applied to CountingRemoveIfForEach, so there is a chance
> that "countingRemoveAllByIndexValues" fails to remove a key from
> parentToChildrenMap.
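The bug described above is a secondary index falling out of sync with the primary data on one removal path. A minimal Python sketch of that general pattern follows, assuming a store that keeps a parent-to-children map next to its primary map. `TinyStore`, its methods and the `update_index` flag are hypothetical names for illustration; they do not reproduce Spark's `InMemoryStore` or the actual patch.

```python
class TinyStore:
    """Hypothetical store with a primary map and a parent -> children index."""

    def __init__(self):
        self.data = {}                # key -> (parent, value)
        self.parent_to_children = {}  # parent -> set of child keys

    def _unindex(self, key, parent):
        children = self.parent_to_children.get(parent)
        if children is not None:
            children.discard(key)
            if not children:
                del self.parent_to_children[parent]

    def put(self, key, parent, value):
        if key in self.data:
            # Drop the old index entry in case the parent changed.
            self._unindex(key, self.data[key][0])
        self.data[key] = (parent, value)
        self.parent_to_children.setdefault(parent, set()).add(key)

    def remove_if(self, predicate, update_index=True):
        """Remove matching entries and return how many were removed.

        update_index=False mimics a removal path that forgets the index.
        """
        removed = 0
        for key in list(self.data):
            parent, value = self.data[key]
            if predicate(value):
                del self.data[key]
                removed += 1
                if update_index:
                    self._unindex(key, parent)
        return removed

    def dangling_index_keys(self):
        """Keys still referenced by the index but absent from the primary map."""
        return {
            key
            for children in self.parent_to_children.values()
            for key in children
            if key not in self.data
        }
```

Calling `remove_if(..., update_index=False)` leaves stale keys behind in `parent_to_children`, which is the kind of leftover the ticket reports; the default path keeps both maps consistent.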
[jira] [Assigned] (SPARK-30879) Refine doc-building workflow
[ https://issues.apache.org/jira/browse/SPARK-30879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen reassigned SPARK-30879:
------------------------------------
    Assignee: Nicholas Chammas

> Refine doc-building workflow
> ----------------------------
>
> Key: SPARK-30879
> URL: https://issues.apache.org/jira/browse/SPARK-30879
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 3.1.0
> Reporter: Nicholas Chammas
> Assignee: Nicholas Chammas
> Priority: Minor
>
> There are a few rough edges in the workflow for building docs that could be
> refined:
> * sudo pip installing stuff
> * no pinned versions of any doc dependencies
> * using some deprecated options
> * race condition with jekyll serve
[jira] [Resolved] (SPARK-30879) Refine doc-building workflow
[ https://issues.apache.org/jira/browse/SPARK-30879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen resolved SPARK-30879.
----------------------------------
    Fix Version/s: 3.1.0
    Resolution: Fixed

Issue resolved by pull request 27534
[https://github.com/apache/spark/pull/27534]

> Refine doc-building workflow
> ----------------------------
>
> Key: SPARK-30879
> URL: https://issues.apache.org/jira/browse/SPARK-30879
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 3.1.0
> Reporter: Nicholas Chammas
> Assignee: Nicholas Chammas
> Priority: Minor
> Fix For: 3.1.0
>
> There are a few rough edges in the workflow for building docs that could be
> refined:
> * sudo pip installing stuff
> * no pinned versions of any doc dependencies
> * using some deprecated options
> * race condition with jekyll serve
[jira] [Assigned] (SPARK-31012) Update ML 3.0 docs
[ https://issues.apache.org/jira/browse/SPARK-31012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen reassigned SPARK-31012:
------------------------------------
    Assignee: Huaxin Gao

> Update ML 3.0 docs
> ------------------
>
> Key: SPARK-31012
> URL: https://issues.apache.org/jira/browse/SPARK-31012
> Project: Spark
> Issue Type: Improvement
> Components: Documentation, ML, PySpark
> Affects Versions: 3.0.0
> Reporter: Huaxin Gao
> Assignee: Huaxin Gao
> Priority: Minor
>
> updating ML docs for 3.0 changes.
[jira] [Resolved] (SPARK-31012) Update ML 3.0 docs
[ https://issues.apache.org/jira/browse/SPARK-31012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen resolved SPARK-31012.
----------------------------------
    Fix Version/s: 3.0.0
    Resolution: Fixed

Issue resolved by pull request 27762
[https://github.com/apache/spark/pull/27762]

> Update ML 3.0 docs
> ------------------
>
> Key: SPARK-31012
> URL: https://issues.apache.org/jira/browse/SPARK-31012
> Project: Spark
> Issue Type: Improvement
> Components: Documentation, ML, PySpark
> Affects Versions: 3.0.0
> Reporter: Huaxin Gao
> Assignee: Huaxin Gao
> Priority: Minor
> Fix For: 3.0.0
>
> updating ML docs for 3.0 changes.
[jira] [Updated] (SPARK-31080) Bugs/missing functions in documents
[ https://issues.apache.org/jira/browse/SPARK-31080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viet updated SPARK-31080:
-------------------------
    Description:
In the current documentation for the SQL API, I noticed that there is no section for the `PIVOT` keyword, which was introduced in 2.4.0.
Is there a bug in `mkdocs`?
Docs: [https://spark.apache.org/docs/latest/api/sql/]
P.S.: I am not sure whether this issue belongs here, but I could not find any other place to put it.

  was:
In the current documentation for the SQL API, I noticed that there is no section for the `PIVOT` keyword, which was introduced in 2.4.0.
Is there a bug in `mkdocs`?
P.S.: I am not sure whether this issue belongs here, but I could not find any other place to put it.

> Bugs/missing functions in documents
> -----------------------------------
>
> Key: SPARK-31080
> URL: https://issues.apache.org/jira/browse/SPARK-31080
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.5
> Reporter: Viet
> Priority: Minor
>
> In the current documentation for the SQL API, I noticed that there is no
> section for the `PIVOT` keyword, which was introduced in 2.4.0.
> Is there a bug in `mkdocs`?
> Docs: [https://spark.apache.org/docs/latest/api/sql/]
> P.S.: I am not sure whether this issue belongs here, but I could not find
> any other place to put it.
[jira] [Created] (SPARK-31080) Bugs/missing functions in documents
Viet created SPARK-31080:
-------------------------

    Summary: Bugs/missing functions in documents
    Key: SPARK-31080
    URL: https://issues.apache.org/jira/browse/SPARK-31080
    Project: Spark
    Issue Type: Bug
    Components: SQL
    Affects Versions: 2.4.5
    Reporter: Viet

In the current documentation for the SQL API, I noticed that there is no section for the `PIVOT` keyword, which was introduced in 2.4.0.
Is there a bug in `mkdocs`?
P.S.: I am not sure whether this issue belongs here, but I could not find any other place to put it.