[GitHub] spark pull request #18711: [SPARK-21506][DOC]The description of "spark.execu...
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/18711#discussion_r129251166

--- Diff: docs/configuration.md ---

```diff
@@ -1103,10 +1103,10 @@ Apart from these, the following properties are also available, and may be useful
 The number of cores to use on each executor. In standalone and Mesos coarse-grained modes, setting this
-parameter allows an application to run multiple executors on the
-same worker, provided that there are enough cores on that
-worker. Otherwise, only one executor per application will run on
-each worker.
+parameter allows an application to launch multiple executors on the
+same worker in one single schedule iteration, provided that there are enough
+cores on that worker. Otherwise, only one executor per application
+will be scheduled on each worker during one single schedule iteration.
```

--- End diff --

Thanks, I agree with you. How would it be better to describe it?

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18702#discussion_r129251081

--- Diff: sql/log4j.properties ---
@@ -0,0 +1,24 @@ +#
--- End diff --

Oh, actually, I think I can get rid of this. Let me try.
[GitHub] spark pull request #18730: [SPARK-21527][CORE] Use buffer limit in order to ...
GitHub user caneGuy opened a pull request: https://github.com/apache/spark/pull/18730

[SPARK-21527][CORE] Use buffer limit in order to use JAVA NIO Util's buffercache

## What changes were proposed in this pull request?

Right now, `ChunkedByteBuffer#writeFully` does not slice bytes first. Observe the code in java.nio `Util#getTemporaryDirectBuffer` below:

```java
BufferCache cache = bufferCache.get();
ByteBuffer buf = cache.get(size);
if (buf != null) {
    return buf;
} else {
    // No suitable buffer in the cache so we need to allocate a new
    // one. To avoid the cache growing then we remove the first
    // buffer from the cache and free it.
    if (!cache.isEmpty()) {
        buf = cache.removeFirst();
        free(buf);
    }
    return ByteBuffer.allocateDirect(size);
}
```

If we slice first with a fixed size, we can use the buffer cache and only need to allocate at the first write call. Otherwise, since we allocate a new buffer each time, we cannot control when that buffer is freed; this once caused a memory issue in our production cluster. In this patch, I supply a new API which slices with a fixed size for buffer writing.

## How was this patch tested?

Unit test and test in production.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/caneGuy/spark zhoukang/improve-chunkwrite

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18730.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #18730

commit 7cbadc5e367a045dd70af4c85e4c17fd0ac3cba7
Author: zhoukang
Date: 2017-07-25T09:44:46Z

    [SPARK][CORE] Slice write by channel
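The slicing idea in the PR above can be sketched as follows. This is a minimal, hypothetical Python illustration of the technique (write a large buffer in fixed-size slices so a small reusable buffer suffices), not Spark's actual `ChunkedByteBuffer` API; the name `write_fully` and the chunk size are illustrative assumptions.

```python
import io

# Assumed fixed slice size; the real value would be tuned per deployment.
WRITE_CHUNK_SIZE = 256 * 1024

def write_fully(channel, buf, chunk_size=WRITE_CHUNK_SIZE):
    """Write `buf` (bytes) to a file-like `channel` in fixed-size slices.

    Writing slices of a bounded size means the underlying layer never
    needs a temporary buffer larger than `chunk_size`, so a cached
    buffer can be reused instead of allocating one per large write.
    """
    for start in range(0, len(buf), chunk_size):
        channel.write(buf[start:start + chunk_size])

# Usage: write 1 MB of data through 4 KB slices.
sink = io.BytesIO()
write_fully(sink, b"x" * 1_000_000, chunk_size=4096)
```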
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555

**[Test build #79930 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79930/testReport)** for PR 18555 at commit [`4c55f61`](https://github.com/apache/spark/commit/4c55f6112e87d1305994ce13bdc3216360f32ecb).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18555

Merged build finished. Test FAILed.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18555

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79930/

Test FAILed.
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18702

@srowen, so, the current structure is like ...

```
.
├── docs
└── sql
    ├── create-docs.sh       # Generates HTMLs.
    ├── gen-sql-markdown.py  # Generates markdown files.
    └── mkdocs.yml           # MkDocs configuration file.
```

After `cd sql && create-docs.sh`:

```
.
├── docs
└── sql
    ├── create-docs.sh
    ├── gen-sql-markdown.py
    ├── mkdocs.yml
    └── site                 # Generated HTML files.
```

After `cd docs && SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 SKIP_RDOC=1 jekyll build`:

```
.
├── docs
│   └── api
│       └── sql
│           └── site         # Copied from ./sql/site
└── sql
    ├── create-docs.sh
    ├── gen-sql-markdown.py
    ├── mkdocs.yml
    └── site
```

It is pretty easy to move files around in any case, and I am fine with trying out moving them multiple times to show how it looks. So ... do you mean something like the one below?

```
.
├── docs
└── sql
    ├── bin
    │   ├── create-docs.sh       # Generates HTMLs.
    │   └── gen-sql-markdown.py  # Generates markdown files.
    └── mkdocs.yml               # MkDocs configuration file.
```

After `create-docs.sh` under `sql`:

```
.
├── docs
│   └── api
│       └── sql
│           └── site             # Generated HTML files.
└── sql
    ├── bin
    │   ├── create-docs.sh
    │   └── gen-sql-markdown.py
    └── mkdocs.yml
```

After `SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 SKIP_RDOC=1 jekyll build`:

```
.
├── docs
│   └── api
│       └── sql
│           └── site             # Generated HTML files.
└── sql
    ├── bin
    │   ├── create-docs.sh
    │   └── gen-sql-markdown.py
    └── mkdocs.yml
```

Or, would this maybe give you another idea?
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555

**[Test build #79932 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79932/testReport)** for PR 18555 at commit [`77a7ca4`](https://github.com/apache/spark/commit/77a7ca41aa08578935727bdfa7f24e519b6b73ad).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18555

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79932/

Test FAILed.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18555

Merged build finished. Test FAILed.
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18702

Oh I see, it's for consistency. Well OK, stay consistent with R/Python then.
[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/18513

Thanks @sethah @hhbyyh for the review. I updated the behavior doc string as suggested. Any other comments? cc @srowen @jkbradley @yanboliang
[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18513

**[Test build #79934 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79934/testReport)** for PR 18513 at commit [`a91b53f`](https://github.com/apache/spark/commit/a91b53f7482b8a05734e77f42491a70f1e3e77f1).
[GitHub] spark pull request #18731: [SPARK-20990][SQL] Read all JSON documents in fil...
GitHub user mgaido91 opened a pull request: https://github.com/apache/spark/pull/18731

[SPARK-20990][SQL] Read all JSON documents in files when multiline mode is on

## What changes were proposed in this pull request?

The PR improves JSON parsing so that all the JSON documents in a file are read even when the `multiLine` option is turned on.

## How was this patch tested?

A UT has been added to verify this patch.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-20990

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18731.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #18731

commit 972a28d44fcd3c457efc4b732ade47c548766a30
Author: Marco Gaido
Date: 2017-07-25T08:30:34Z

    [SPARK-20990][SQL] Read all JSON documents in files when multiline mode is on
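The idea in the PR above — reading every JSON document in a file rather than stopping after the first, even when documents span multiple lines — can be illustrated with Python's standard library. This is only a sketch of the parsing concept, not Spark's actual implementation; `read_all_json_documents` is a hypothetical helper name.

```python
import json

def read_all_json_documents(text):
    """Parse every JSON document in a string, even when documents
    span multiple lines and are simply concatenated."""
    decoder = json.JSONDecoder()
    docs = []
    text = text.strip()
    idx = 0
    while idx < len(text):
        # raw_decode parses one document and reports where it ended,
        # so we can continue from there instead of stopping.
        obj, end = decoder.raw_decode(text, idx)
        docs.append(obj)
        # Skip whitespace (including newlines) between documents.
        while end < len(text) and text[end].isspace():
            end += 1
        idx = end
    return docs

# Two multi-line documents in one "file".
multiline_file = '{"name": "alice",\n "age": 30}\n{"name": "bob",\n "age": 25}'
docs = read_all_json_documents(multiline_file)
```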
[GitHub] spark issue #18726: [MINOR][CORE][TEST]Repeat stop SparkContext in ExecutorA...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18726

IIUC, closing a context that is already closed is fine for Spark, though neither your code nor the previous one handles object leakage when SparkContext creation fails.
[GitHub] spark issue #18728: [SPARK-21524] [ML] unit test fix: ValidatorParamsSuiteHe...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18728

Hm, is the more basic error here that `object ValidatorParamsSuiteHelpers extends SparkFunSuite`? It seems like it's trying to leverage the temp dir cleanup function in the superclass, but that will never be invoked because this object is never used as an actual test. What about removing both of those superclasses?
[GitHub] spark pull request #18711: [SPARK-21506][DOC]The description of "spark.execu...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/18711#discussion_r129248960

--- Diff: docs/configuration.md ---

```diff
@@ -1103,10 +1103,10 @@ Apart from these, the following properties are also available, and may be useful
 The number of cores to use on each executor. In standalone and Mesos coarse-grained modes, setting this
-parameter allows an application to run multiple executors on the
-same worker, provided that there are enough cores on that
-worker. Otherwise, only one executor per application will run on
-each worker.
+parameter allows an application to launch multiple executors on the
+same worker in one single schedule iteration, provided that there are enough
+cores on that worker. Otherwise, only one executor per application
+will be scheduled on each worker during one single schedule iteration.
```

--- End diff --

My hunch is that users may not know the actual meaning of "one single schedule iteration" even if you're trying to clarify the behavior here, unless they know the implementation.
[GitHub] spark issue #18726: [MINOR][CORE][TEST]Repeat stop SparkContext in ExecutorA...
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/18726

@jerryshao I think closing the SparkContext again has little significance, because it is already closed; Spark just prints `SparkContext already stopped.`
[GitHub] spark issue #18651: [SPARK-21383][Core] Fix the YarnAllocator allocates more...
Github user djvulee commented on the issue: https://github.com/apache/spark/pull/18651

I updated the code, please take a look @vanzin @tgravescs
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18702

(I got rid of the log4j file by avoiding direct access to the JVM)
[GitHub] spark issue #18723: [SPARK-21517][CORE] Avoid copying memory when transfer c...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18723

Thank you. 1. makes sense, since memory is allocated by 2x at that point. While I looked at a [PR](https://github.com/netty/netty/pull/464) for netty, I cannot understand why 16 was used.

I have another question, to understand why it happens: does this OOM occur whenever any OpenBlocks message is received, or only in a specific scenario (e.g. receiving a large message, a lot of multiple messages, or so on)?
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18702

(sorry, I just edited the structure above to be more correct)
[GitHub] spark issue #18723: [SPARK-21517][CORE] Avoid copying memory when transfer c...
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/18723

@kiszk Actually, I am confused by the default value 16 too. Yes, it occurs in a specific scenario: in our case, it was a large block of data which caused this issue.
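The concern in this exchange — a buffer that starts tiny (e.g. 16 bytes) and doubles on demand — can be made concrete with a small sketch. This is a generic illustration of geometric buffer growth, not netty's exact implementation: writing one large message forces the final allocation plus nearly the same amount again in copying along the way.

```python
def grow_by_doubling(initial_capacity, needed):
    """Simulate a buffer that doubles its capacity until `needed` bytes fit.

    Returns (final_capacity, total_bytes_copied). On each resize the
    existing contents (modeled here as a full buffer, the worst case)
    must be copied into the new, larger allocation.
    """
    capacity = initial_capacity
    copied = 0
    while capacity < needed:
        copied += capacity  # existing contents are copied on every resize
        capacity *= 2
    return capacity, copied

# A 1 GiB message into a buffer that starts at 16 bytes: the final
# capacity is 1 GiB, and almost another 1 GiB of copying happens
# across the 26 intermediate resizes.
final_capacity, total_copied = grow_by_doubling(16, 1 << 30)
```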
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18702

Thank you @srowen. I will double-check, clean up minor nits, and ping you by tomorrow.
[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18711

**[Test build #79925 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79925/testReport)** for PR 18711 at commit [`bc423d1`](https://github.com/apache/spark/commit/bc423d188aaef4febce382ed14e86f462ec853cb).
[GitHub] spark pull request #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18702#discussion_r129250088

--- Diff: sql/log4j.properties ---
@@ -0,0 +1,24 @@ +#
--- End diff --

This is used - https://github.com/apache/spark/pull/18702/files#diff-a4b1e8e0e72fd59bd246285a34b21a45R49

I hesitated to add this one, but just added it to make the infos quiet when a Spark session is initialised. There is a similar one in R - https://github.com/apache/spark/blob/master/R/run-tests.sh#L26 and https://github.com/apache/spark/blob/master/R/log4j.properties - but in that case the logs are quite small.

```bash
$ sh create-docs.sh
```

**Before**

```
Generating markdown files for SQL documentation.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/07/25 18:02:57 INFO SparkContext: Running Spark version 2.3.0-SNAPSHOT
17/07/25 18:02:57 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/07/25 18:02:57 INFO SparkContext: Submitted application: GenSQLDocs
17/07/25 18:02:57 INFO SecurityManager: Changing view acls to: hyukjinkwon
17/07/25 18:02:57 INFO SecurityManager: Changing modify acls to: hyukjinkwon
17/07/25 18:02:57 INFO SecurityManager: Changing view acls groups to:
17/07/25 18:02:57 INFO SecurityManager: Changing modify acls groups to:
17/07/25 18:02:57 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hyukjinkwon); groups with view permissions: Set(); users with modify permissions: Set(hyukjinkwon); groups with modify permissions: Set()
17/07/25 18:02:58 INFO Utils: Successfully started service 'sparkDriver' on port 62695.
17/07/25 18:02:58 INFO SparkEnv: Registering MapOutputTracker
17/07/25 18:02:58 INFO SparkEnv: Registering BlockManagerMaster
17/07/25 18:02:58 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/07/25 18:02:58 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/07/25 18:02:58 INFO DiskBlockManager: Created local directory at /private/var/folders/9j/gf_c342d7d150mwrxvkqnc18gn/T/blockmgr-13909519-0864-4aa3-82fb-eba4b9d1e527
17/07/25 18:02:58 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
17/07/25 18:02:58 INFO SparkEnv: Registering OutputCommitCoordinator
17/07/25 18:02:58 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/07/25 18:02:58 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.0.146:4040
17/07/25 18:02:58 INFO SparkContext: Added file file:/.../spark/sql/gen-sql-markdown.py at file:/.../spark/sql/gen-sql-markdown.py with timestamp 1500973378636
17/07/25 18:02:58 INFO Utils: Copying /.../spark/sql/gen-sql-markdown.py to /private/var/folders/9j/gf_c342d7d150mwrxvkqnc18gn/T/spark-0eb99795-6a5e-4101-8ed2-c55b3ba80173/userFiles-64210993-0ed8-4c05-85bd-83d13ae22831/gen-sql-markdown.py
17/07/25 18:02:58 INFO Executor: Starting executor ID driver on host localhost
17/07/25 18:02:58 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 62696.
17/07/25 18:02:58 INFO NettyBlockTransferService: Server created on 192.168.0.146:62696
17/07/25 18:02:58 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/07/25 18:02:58 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.0.146, 62696, None)
17/07/25 18:02:58 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.0.146:62696 with 366.3 MB RAM, BlockManagerId(driver, 192.168.0.146, 62696, None)
17/07/25 18:02:58 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.0.146, 62696, None)
17/07/25 18:02:58 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.0.146, 62696, None)
17/07/25 18:02:59 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('/.../spark/sql/_spark-warehouse').
17/07/25 18:02:59 INFO SharedState: Warehouse path is '/.../spark/sql/_spark-warehouse'.
17/07/25 18:02:59 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
17/07/25 18:02:59 INFO SparkUI: Stopped Spark web UI at http://192.168.0.146:4040
17/07/25 18:02:59 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/07/25 18:02:59 INFO MemoryStore:
```

**After**

```
Generating markdown files for SQL documentation.
Generating HTML files for SQL documentation.
INFO    -  Cleaning site directory
INFO    -  Building documentation to directory: /.../spark/sql/site
```
[GitHub] spark issue #18722: [SPARK-21498][Examples] quick start -> one py demo have ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18722

LGTM
[GitHub] spark issue #18726: [MINOR][CORE][TEST]Repeat stop SparkContext in ExecutorA...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18726

I don't see a significant overhead here by calling `stop()` again. I don't see a strong reason to change the code here AFAIK.
[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18711

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79925/

Test PASSed.
[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18711

Merged build finished. Test PASSed.
[GitHub] spark pull request #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18702#discussion_r129245068 --- Diff: sql/gen-sql-markdown.py --- @@ -0,0 +1,96 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +import sys --- End diff -- Does it complicate things to put these files in some kind of `bin` directory under `sql`?
[GitHub] spark pull request #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18702#discussion_r129245196 --- Diff: docs/_plugins/copy_api_dirs.rb --- @@ -150,4 +150,31 @@ cp("../R/pkg/DESCRIPTION", "api") end + if not (ENV['SKIP_SQLDOC'] == '1') +# Build SQL API docs + +puts "Moving to project root and building API docs." +curr_dir = pwd +cd("..") --- End diff -- Rather than `cd`, is it possible to generate the output directly into `docs/` somewhere? Maybe I miss why that's hard. It would avoid creating more temp output dirs in the `sql` folder.
[GitHub] spark pull request #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18702#discussion_r129244802 --- Diff: sql/log4j.properties --- @@ -0,0 +1,24 @@ +# --- End diff -- Pardon, is a log4j.properties really needed here? What reads it?
[GitHub] spark issue #18722: [SPARK-21498][Examples] quick start -> one py demo have ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18722 **[Test build #3851 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3851/testReport)** for PR 18722 at commit [`a31cbd8`](https://github.com/apache/spark/commit/a31cbd8aae238e324bcc7de111a7107c89e9a979). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18610: [SPARK-21386] ML LinearRegression supports warm start fr...
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/18610 There was a _lot_ of discussion around the `KMeans` initial model. But here we seem to be just ignoring certain issues. Such as, does the `initialModel` override the param settings (e.g. reg param, tolerance, iterations etc). Here it is just the coefficients that are used.
[GitHub] spark issue #18651: [SPARK-21383][Core] Fix the YarnAllocator allocates more...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18651 **[Test build #79928 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79928/testReport)** for PR 18651 at commit [`7703494`](https://github.com/apache/spark/commit/77034944610c5973325bb3fd71ac9f153f59d32b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18651: [SPARK-21383][Core] Fix the YarnAllocator allocates more...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18651 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79928/ Test PASSed.
[GitHub] spark issue #18651: [SPARK-21383][Core] Fix the YarnAllocator allocates more...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18651 Merged build finished. Test PASSed.
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18702 In any case, it is easy. I was just thinking of matching it up with, like, R and Python which are, up to my knowledge, something like the below:

**R**:
```
.
├── docs
└── R
    └── create-docs.sh    # Generates HTMLs.
```
After `cd R && create-docs.sh`:
```
.
├── docs
└── R
    ├── create-docs.sh
    └── pkg
        └── html          # Generated HTML files.
```
After `cd docs && jekyll build`:
```
.
├── docs
│   └── api
│       └── R             # Copied from ./R/pkg/html
└── R
    ├── create-docs.sh
    └── pkg
        └── html
```

**Python**:
```
.
├── docs
└── python
    └── docs
        └── Makefile      # Generates HTMLs.
```
After `cd python && make html`:
```
.
├── docs
└── python
    └── docs
        ├── Makefile
        └── _build
            └── html      # Generated HTML files.
```
After `cd docs && jekyll build`:
```
.
├── docs
│   └── api
│       └── python        # Copied from ./python/docs/_build/html
└── python
    └── docs
        ├── Makefile
        └── _build
            └── html
```
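[Editor's note] The generate-then-copy flow described above can be sketched as a few shell commands. This is purely an illustration of the directory movement, not the real build scripts — the generated HTML is faked with a stub file, and the actual copying is done by `docs/_plugins/copy_api_dirs.rb` during `jekyll build`:

```shell
# Work in a scratch directory so nothing in the repo is touched.
tmp="$(mktemp -d)" && cd "$tmp"

# Step 1: the language-specific doc tool writes HTML under its own tree
# (stand-in for `cd R && ./create-docs.sh` producing R/pkg/html).
mkdir -p R/pkg/html docs/api
echo '<html>R API</html>' > R/pkg/html/index.html

# Step 2: the Jekyll plugin copies that tree into docs/api/<lang>.
cp -r R/pkg/html docs/api/R

ls docs/api/R
```

The same two-step shape (generate in place, then copy into `docs/api/`) is what the R and Python trees above show.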
[GitHub] spark issue #18729: [SPARK-21526] [MLlib] Add support to ML LogisticRegressi...
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/18729 Hi there, thanks for this. However, please see #18610 and related JIRA tickets. The solution is slightly more complex and needs to be done in a more generic way. I suggest that this be updated once agreement on #18610 is reached, and makes use of the shared trait in that PR, etc.
[GitHub] spark pull request #18722: [SPARK-21498][Examples] quick start -> one py dem...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18722#discussion_r129244806 --- Diff: docs/quick-start.md --- @@ -421,16 +421,15 @@ $ YOUR_SPARK_HOME/bin/spark-submit \ Lines with a: 46, Lines with b: 23 {% endhighlight %} -If you have PySpark pip installed into your enviroment (e.g. `pip instal pyspark` you can run your application with the regular Python interpeter or use the provided spark-submit as you prefer. +If you have PySpark pip installed into your enviroment (e.g., `pip install pyspark` you can run your application with the regular Python interpreter or use the provided 'spark-submit' as you prefer. --- End diff -- others look good but for the last one ... `` pip install pyspark` `` -> `` pip install pyspark`) `` (missing closing parenthesis).
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555 **[Test build #79926 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79926/testReport)** for PR 18555 at commit [`26afb05`](https://github.com/apache/spark/commit/26afb05a78447590144def42a54d9f88095c78a5).
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18555 Merged build finished. Test FAILed.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18555 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79926/ Test FAILed.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555 **[Test build #79926 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79926/testReport)** for PR 18555 at commit [`26afb05`](https://github.com/apache/spark/commit/26afb05a78447590144def42a54d9f88095c78a5). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18503 **[Test build #79929 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79929/testReport)** for PR 18503 at commit [`54be80e`](https://github.com/apache/spark/commit/54be80ef4849fddb3ff51c53a64173883d1ed026).
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555 **[Test build #79932 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79932/testReport)** for PR 18555 at commit [`77a7ca4`](https://github.com/apache/spark/commit/77a7ca41aa08578935727bdfa7f24e519b6b73ad).
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18702 Yeah, the latter is what I had in mind, but I don't know, is it hard? It does seem more natural to generate the output directly into its destination, I think, unless I'm overlooking something else the docs process does.
[GitHub] spark pull request #18711: [SPARK-21506][DOC]The description of "spark.execu...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/18711#discussion_r129246022 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -580,7 +580,13 @@ private[deploy] class Master( * The number of cores assigned to each executor is configurable. When this is explicitly set, * multiple executors from the same application may be launched on the same worker if the worker * has enough cores and memory. Otherwise, each executor grabs all the cores available on the - * worker by default, in which case only one executor may be launched on each worker. + * worker by default, in which case only one executor may be launched on each worker during one + * single schedule iteration. + * Note that when `spark.executor.cores` is not set, we may still launch multiple executors from + * the same application on the same worker. Consider appA and appB both have one executor running + * on worker1, and appA.coresLeft > 0, then appB is finished and release all its cores on worker1, + * thus for the next schedule iteration, appA launchs a new executor that grabs all the free cores --- End diff -- nit: launches
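[Editor's note] The appA/appB scenario in the quoted doc comment can be illustrated with a toy model. This is purely illustrative — the names and structure below are invented and are not Spark's actual `Master` scheduling code; it only mirrors the described behavior that, with `spark.executor.cores` unset, each new executor grabs all free cores on a worker, so a second schedule iteration after appB frees cores gives appA a second executor on the same worker:

```python
# Toy model of the standalone-master scenario described above
# (illustrative only; all names here are invented, not Spark's code).
worker1 = {"cores_free": 4, "executors": []}

def schedule(app, worker):
    # With spark.executor.cores unset, one new executor grabs
    # every free core on the worker in a single schedule iteration.
    if app["cores_left"] > 0 and worker["cores_free"] > 0:
        grabbed = worker["cores_free"]
        worker["executors"].append((app["name"], grabbed))
        worker["cores_free"] = 0
        app["cores_left"] -= grabbed

appA = {"name": "appA", "cores_left": 8}
schedule(appA, worker1)      # iteration 1: appA's executor takes all 4 cores
worker1["cores_free"] += 2   # appB (not modeled) finishes and frees 2 cores
schedule(appA, worker1)      # iteration 2: appA launches a second executor
print(worker1["executors"])  # [('appA', 4), ('appA', 2)]
```

So "one executor per worker" only holds within a single schedule iteration, which is exactly the nuance the doc change is trying to capture.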
[GitHub] spark pull request #18722: [SPARK-21498][Examples] quick start -> one py dem...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18722#discussion_r129248977 --- Diff: docs/quick-start.md --- @@ -421,7 +421,7 @@ $ YOUR_SPARK_HOME/bin/spark-submit \ Lines with a: 46, Lines with b: 23 {% endhighlight %} -If you have PySpark pip installed into your enviroment (e.g., `pip install pyspark` you can run your application with the regular Python interpreter or use the provided 'spark-submit' as you prefer. +If you have PySpark pip installed into your enviroment (e.g., `pip install pyspark`), you can run your application with the regular Python Interpreter or use the provided 'spark-submit' as you prefer. --- End diff -- `interpreter` was fine ...
[GitHub] spark issue #18722: [SPARK-21498][Examples] quick start -> one py demo have ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18722 **[Test build #3851 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3851/testReport)** for PR 18722 at commit [`a31cbd8`](https://github.com/apache/spark/commit/a31cbd8aae238e324bcc7de111a7107c89e9a979).
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18555 Merged build finished. Test FAILed.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555 **[Test build #79927 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79927/testReport)** for PR 18555 at commit [`364cfc4`](https://github.com/apache/spark/commit/364cfc4d6c8efc2a260d2a7cd578d9099851e3d2). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18555 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79927/ Test FAILed.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555 **[Test build #79930 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79930/testReport)** for PR 18555 at commit [`4c55f61`](https://github.com/apache/spark/commit/4c55f6112e87d1305994ce13bdc3216360f32ecb).
[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18711 **[Test build #79925 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79925/testReport)** for PR 18711 at commit [`bc423d1`](https://github.com/apache/spark/commit/bc423d188aaef4febce382ed14e86f462ec853cb). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18503 **[Test build #79924 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79924/testReport)** for PR 18503 at commit [`0159701`](https://github.com/apache/spark/commit/01597010145f7769506cfa51c0450a2db779fc07). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18730: [SPARK-21527][CORE] Use buffer limit in order to use JAV...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18730 Can one of the admins verify this patch?
[GitHub] spark issue #14151: [SPARK-16496][SQL] Add wholetext as option for reading t...
Github user ScrapCodes commented on the issue: https://github.com/apache/spark/pull/14151 @sameeragarwal Do you think this change still makes sense? Can I improve it somehow?
[GitHub] spark pull request #18711: [SPARK-21506][DOC]The description of "spark.execu...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/18711#discussion_r129245415 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -580,7 +580,13 @@ private[deploy] class Master( * The number of cores assigned to each executor is configurable. When this is explicitly set, * multiple executors from the same application may be launched on the same worker if the worker * has enough cores and memory. Otherwise, each executor grabs all the cores available on the - * worker by default, in which case only one executor may be launched on each worker. + * worker by default, in which case only one executor may be launched on each worker during one + * single schedule iteration. + * Note that when `spark.executor.cores` is not set, we may still launch multiple executors from + * the same application on the same worker. Consider appA and appB both have one executor running + * on worker1, and appA.coresLeft > 0, then appB is finished and release all its cores on worker1, + * thus for the next schedule iteration, appA launchs a new executor that grabs all the free cores + * on worker1, therefore we get mulfiple executors from appA running on worker1. --- End diff -- nit: multiple.
[GitHub] spark issue #18693: [SPARK-21491][GraphX] Enhance GraphX performance: breakO...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18693 **[Test build #3850 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3850/testReport)** for PR 18693 at commit [`0ae9cc5`](https://github.com/apache/spark/commit/0ae9cc52737c6864d8db95ac2cbfe7b2334c5e5c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18702#discussion_r129251387 --- Diff: sql/gen-sql-markdown.py --- @@ -0,0 +1,96 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +import sys --- End diff -- It should be pretty simple. Let me try.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555 **[Test build #79927 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79927/testReport)** for PR 18555 at commit [`364cfc4`](https://github.com/apache/spark/commit/364cfc4d6c8efc2a260d2a7cd578d9099851e3d2).
[GitHub] spark pull request #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18503#discussion_r129257673 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala --- @@ -363,7 +363,8 @@ private[state] class HDFSBackedStateStoreProvider extends StateStoreProvider wit val valueRowBuffer = new Array[Byte](valueSize) ByteStreams.readFully(input, valueRowBuffer, 0, valueSize) val valueRow = new UnsafeRow(valueSchema.fields.length) -valueRow.pointTo(valueRowBuffer, valueSize) +// If valueSize in existing file is not multiple of 8, round it down to multiple of 8 +valueRow.pointTo(valueRowBuffer, (valueSize / 8) * 8) --- End diff -- Sure, added more comments.
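The rounding trick in the diff, truncating a byte size down to the nearest multiple of 8 so that `UnsafeRow`'s word-aligned accessors stay in bounds, can be shown in isolation (a standalone sketch of the arithmetic, not the Spark code itself):

```python
def round_down_to_multiple_of_8(value_size):
    # Integer division discards the remainder, so the result is the
    # largest multiple of 8 that does not exceed value_size.
    return (value_size // 8) * 8

for size in (13, 16, 7):
    print(size, "->", round_down_to_multiple_of_8(size))
```

A size that is already a multiple of 8 is unchanged, which is why the fix is safe for files written by newer Spark versions as well.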
[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18709 LGTM except one comment.
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18702 I addressed the log4j comment for now first (I got rid of the log4j file by avoiding directly accessing the JVM).
[GitHub] spark issue #18693: [SPARK-21491][GraphX] Enhance GraphX performance: breakO...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18693 **[Test build #3850 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3850/testReport)** for PR 18693 at commit [`0ae9cc5`](https://github.com/apache/spark/commit/0ae9cc52737c6864d8db95ac2cbfe7b2334c5e5c).
[GitHub] spark pull request #18722: [SPARK-21498][Examples] quick start -> one py dem...
Github user lizhaoch commented on a diff in the pull request: https://github.com/apache/spark/pull/18722#discussion_r129246403 --- Diff: docs/quick-start.md --- @@ -421,16 +421,15 @@ $ YOUR_SPARK_HOME/bin/spark-submit \ Lines with a: 46, Lines with b: 23 {% endhighlight %} -If you have PySpark pip installed into your enviroment (e.g. `pip instal pyspark` you can run your application with the regular Python interpeter or use the provided spark-submit as you prefer. +If you have PySpark pip installed into your enviroment (e.g., `pip install pyspark` you can run your application with the regular Python interpreter or use the provided 'spark-submit' as you prefer. --- End diff -- Submitted again. You are so careful, thanks!
[GitHub] spark pull request #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18702#discussion_r129254786 --- Diff: docs/_plugins/copy_api_dirs.rb --- @@ -150,4 +150,31 @@ cp("../R/pkg/DESCRIPTION", "api") end + if not (ENV['SKIP_SQLDOC'] == '1') +# Build SQL API docs + +puts "Moving to project root and building API docs." +curr_dir = pwd +cd("..") --- End diff -- I think I misunderstood your previous comments initially. It shouldn't be hard, but the other language API docs are, to my knowledge, in separate directories and then copied into `docs/` later. I was thinking we could extend this further in the future (e.g., syntax documentation), and it could be easier to check doc output in a separate dir (actually, for me, I check the other docs' output this way more often).
[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18503 Merged build finished. Test FAILed.
[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18503 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79924/ Test FAILed.
[GitHub] spark pull request #18709: [SPARK-21504] [SQL] Add spark version info into t...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18709#discussion_r129259658 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -932,6 +934,7 @@ private[hive] object HiveClientImpl { table.storage.serde.getOrElse("org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe")) table.storage.properties.foreach { case (k, v) => hiveTable.setSerdeParam(k, v) } table.properties.foreach { case (k, v) => hiveTable.setProperty(k, v) } +hiveTable.setProperty(CREATED_SPARK_VERSION, table.createVersion) --- End diff -- Seems better to do it in `HiveExternalCatalog`? This version string stuff is not related to Hive.
[GitHub] spark issue #18651: [SPARK-21383][Core] Fix the YarnAllocator allocates more...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18651 **[Test build #79928 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79928/testReport)** for PR 18651 at commit [`7703494`](https://github.com/apache/spark/commit/77034944610c5973325bb3fd71ac9f153f59d32b).
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18702 **[Test build #79931 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79931/testReport)** for PR 18702 at commit [`c92533b`](https://github.com/apache/spark/commit/c92533b3e36eb5ff7437aeb1f55fb2ddaf5cc407).
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555 **[Test build #79933 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79933/testReport)** for PR 18555 at commit [`0970a89`](https://github.com/apache/spark/commit/0970a891d02865ef55d2d007a38bbe7ebae66877).
[GitHub] spark issue #18731: [SPARK-20990][SQL] Read all JSON documents in files when...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18731 Can one of the admins verify this patch?
[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18388 thanks, merging to master!
[GitHub] spark issue #16765: [SPARK-19425][SQL] Make ExtractEquiJoinKeys support UDT ...
Github user joosterman commented on the issue: https://github.com/apache/spark/pull/16765 @alg-jmx Issue is fixed for Spark 2.2.0 according to the ticket here: https://issues.apache.org/jira/browse/SPARK-19425
[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18503 **[Test build #79929 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79929/testReport)** for PR 18503 at commit [`54be80e`](https://github.com/apache/spark/commit/54be80ef4849fddb3ff51c53a64173883d1ed026). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18503 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79929/ Test PASSed.
[GitHub] spark pull request #18726: [MINOR][CORE][TEST]Repeat stop SparkContext in Ex...
Github user heary-cao closed the pull request at: https://github.com/apache/spark/pull/18726
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18702 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79931/ Test PASSed.
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18702 Merged build finished. Test PASSed.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555 **[Test build #79933 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79933/testReport)** for PR 18555 at commit [`0970a89`](https://github.com/apache/spark/commit/0970a891d02865ef55d2d007a38bbe7ebae66877). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18731: [SPARK-20990][SQL] Read all JSON documents in files when...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/18731 cc @gatorsmile
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17848 **[Test build #79936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79936/testReport)** for PR 17848 at commit [`bf060d6`](https://github.com/apache/spark/commit/bf060d66f62bcb54b7177af24a2dd7a9198b9864).
[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18513 Merged build finished. Test PASSed.
[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18513 **[Test build #79934 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79934/testReport)** for PR 18513 at commit [`a91b53f`](https://github.com/apache/spark/commit/a91b53f7482b8a05734e77f42491a70f1e3e77f1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18513: [SPARK-13969][ML] Add FeatureHasher transformer
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18513 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79934/ Test PASSed.
[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18388
[GitHub] spark pull request #18722: [SPARK-21498][Examples] quick start -> one py dem...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18722
[GitHub] spark issue #18702: [SPARK-21485][SQL][DOCS] Spark SQL documentation generat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18702 **[Test build #79931 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79931/testReport)** for PR 18702 at commit [`c92533b`](https://github.com/apache/spark/commit/c92533b3e36eb5ff7437aeb1f55fb2ddaf5cc407). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17848 **[Test build #79935 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79935/testReport)** for PR 17848 at commit [`1b3aa22`](https://github.com/apache/spark/commit/1b3aa22e07821b2303f3750470a5617b296ec317).
[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @tgravescs Thanks for help. > I think we should expand the description of the config to say what happens when the limit is hit. Since its not using real flow control a user might set this thinking nothing bad will happen, but its dropping connections so could cause failures if the retries don't work. Could you give the link to the JIRA? I'm happy to work on a follow-up PR if possible. For the flow-control part, I'm just worried the queue will be too large, causing memory issues.
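The trade-off being discussed here, rejecting new requests once a threshold is hit rather than queueing them without bound, can be sketched with a toy bounded queue (illustrative only; the actual patch enforces a memory limit inside the shuffle service, not with this API):

```python
import queue

# Toy model: admit requests until a fixed bound is hit, then reject instead
# of letting the backlog (and the memory it holds) grow without limit.
pending = queue.Queue(maxsize=2)

def try_admit(request):
    try:
        pending.put_nowait(request)
        return True
    except queue.Full:
        # Rejected: the caller must retry later, analogous to a rejected
        # OpenBlocks message triggering the client's retry logic.
        return False

results = [try_admit(r) for r in ("req1", "req2", "req3")]
print(results)
```

As the review comment notes, rejection is not real flow control: if the client's retries are exhausted before capacity frees up, the request fails, which is why documenting the limit's behavior matters.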
[GitHub] spark pull request #18610: [SPARK-21386] ML LinearRegression supports warm s...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18610#discussion_r129324027 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -365,7 +398,11 @@ class LinearRegression @Since("1.3.0") (@Since("1.3.0") override val uid: String new BreezeOWLQN[Int, BDV[Double]]($(maxIter), 10, effectiveL1RegFun, $(tol)) } -val initialCoefficients = Vectors.zeros(numFeatures) +val initialCoefficients = if (isSet(initialModel)) { + $(initialModel).coefficients --- End diff -- Here we only use ```coefficients```, not ```intercept```. This is because the ```intercept``` is computed in closed form after the coefficients have converged.
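The closed-form intercept mentioned in this comment, derived from the label and feature means once the coefficients are fixed, looks like this in a toy setting (a sketch of the standard formula, not Spark's implementation):

```python
# For linear regression with an intercept, once the coefficients w are
# fixed, the optimal intercept is b = mean(y) - w . mean(x). This is why
# the warm start only needs to seed the coefficients.
def closed_form_intercept(coefficients, feature_means, label_mean):
    return label_mean - sum(w * m for w, m in zip(coefficients, feature_means))

# Example: data generated by y = 2*x + 1, with mean(x) = 3 and mean(y) = 7;
# given w = [2.0], the formula recovers the intercept.
print(closed_form_intercept([2.0], [3.0], 7.0))
```

This explains the review point: seeding the optimizer with the initial model's intercept would have no effect, since the intercept is recomputed from the converged coefficients anyway.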
[GitHub] spark issue #18707: [SPARK-21503][UI]: Spark UI shows incorrect task status ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18707 **[Test build #79937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79937/testReport)** for PR 18707 at commit [`81422e0`](https://github.com/apache/spark/commit/81422e0f634c0f06eb2ea29fba4281176a1ab528).
[GitHub] spark pull request #18707: [SPARK-21503][UI]: Spark UI shows incorrect task ...
Github user pgandhi999 commented on a diff in the pull request: https://github.com/apache/spark/pull/18707#discussion_r129330779 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala --- @@ -140,6 +140,8 @@ class ExecutorsListener(storageStatusListener: StorageStatusListener, conf: Spar return case _: ExceptionFailure => taskSummary.tasksFailed += 1 +case _: ExecutorLostFailure => --- End diff -- Hi, I have replaced most of the task-failed cases with info.successful and tested it as well. I have also written a unit test to simulate my issue. Thank you.
[GitHub] spark issue #18651: [SPARK-21383][Core] Fix the YarnAllocator allocates more...
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18651 LGTM. @vanzin anything further?
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17848 let's leave the Java UDF API unchanged and think about whether we should add a Java UDF API in `functions` later. @gatorsmile can you update the PR title? Thanks!
[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18503 Merged build finished. Test PASSed.