[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/22683 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/22683 retest this please
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/22683 @srowen Yes, I agree with you! These places should be consistent; otherwise it is easy to get confused. I will try to modify the log statements and docs. Should I modify them in this PR or open a new one?
[GitHub] spark pull request #22683: [SPARK-25696] The storage memory displayed on spa...
Github user httfighter commented on a diff in the pull request: https://github.com/apache/spark/pull/22683#discussion_r238215152

--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1164,17 +1164,17 @@ private[spark] object Utils extends Logging {
     } else {
       val (value, unit) = {
         if (size >= 2 * EB) {
-          (BigDecimal(size) / EB, "EB")
+          (BigDecimal(size) / EB, "EiB")
--- End diff --

OK! I have submitted a modification.
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/22683 retest this please
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/22683 @srowen Sorry, I just saw your message. I am a little busy on weekdays, but I will try to modify the test cases in the coming days.
[GitHub] spark pull request #22683: [SPARK-25696] The storage memory displayed on spa...
GitHub user httfighter reopened a pull request: https://github.com/apache/spark/pull/22683

[SPARK-25696] The storage memory displayed on spark Application UI is incorrect.

## What changes were proposed in this pull request?

In the reported heartbeat information, the unit of the memory data is bytes, which is converted by the formatBytes() function in the utils.js file before being displayed in the interface. The base used for unit conversion in formatBytes is 1000, but it should be 1024. This change sets the conversion base in formatBytes to 1024.

## How was this patch tested?

Manual tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/httfighter/spark SPARK-25696
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22683.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #22683

commit 9e45697296039e55e85dd204788e287c9c60fceb
Author: 韩田田00222924
Date: 2018-10-10T06:47:36Z
[SPARK-25696] The storage memory displayed on spark Application UI is incorrect.

commit 3bf6ca58904f4f1d363e8505bd9d14e5aad0ebd7
Author: 韩田田00222924
Date: 2018-11-24T08:53:12Z
Supplement the modification of the memory unit displayed on the UI
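The 1000-vs-1024 distinction in the PR description above can be sketched outside of Spark. The snippet below is an illustrative Java re-implementation of the conversion that formatBytes performs in utils.js; the class name, method name, and unit labels are assumptions for the sketch, not Spark's actual code.

```java
public class FormatBytes {
    // Convert a byte count to a human-readable string using base 1024
    // (binary units), the behavior the PR proposes for utils.js.
    static String formatBytes(double bytes) {
        String[] units = {"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"};
        int i = 0;
        // Divide by 1024 (not 1000) while the value still fits a larger unit.
        while (bytes >= 1024 && i < units.length - 1) {
            bytes /= 1024;
            i++;
        }
        return String.format(java.util.Locale.ROOT, "%.1f %s", bytes, units[i]);
    }

    public static void main(String[] args) {
        // With base 1024, 1048576 bytes is exactly 1.0 MiB; with the old
        // base 1000 it would have displayed as roughly 1.0 MB at 1000000.
        System.out.println(formatBytes(1048576)); // 1.0 MiB
        System.out.println(formatBytes(1024));    // 1.0 KiB
    }
}
```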
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/22683 @srowen @ajbozarth I have added the changes; could you help me review the code? Thank you very much.
[GitHub] spark pull request #22683: [SPARK-25696] The storage memory displayed on spa...
Github user httfighter closed the pull request at: https://github.com/apache/spark/pull/22683
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/22683 @srowen OK. Thank you very much for your advice.
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/22683

@srowen @ajbozarth I am not sure about some things; can you give me some advice? While making the change, a question came up: in Spark, do "m" and "mb" represent MiB? Spark does not use decimal (power-of-1000) units when converting between numbers:

private static final ImmutableMap<String, ByteUnit> byteSuffixes =
  ImmutableMap.<String, ByteUnit>builder()
    .put("b", ByteUnit.BYTE)
    .put("k", ByteUnit.KiB)
    .put("kb", ByteUnit.KiB)
    .put("m", ByteUnit.MiB)
    .put("mb", ByteUnit.MiB)
    .put("g", ByteUnit.GiB)
    .put("gb", ByteUnit.GiB)
    .put("t", ByteUnit.TiB)
    .put("tb", ByteUnit.TiB)
    .put("p", ByteUnit.PiB)
    .put("pb", ByteUnit.PiB)
    .build();

For example, spark.kryoserializer.buffer (default 64k) is documented as: "Initial size of Kryo's serialization buffer, in KiB unless otherwise specified. Note that there will be one buffer per core on each worker. This buffer will grow up to spark.kryoserializer.buffer.max if needed."

If this is the case, can we guarantee only that 1024 is used uniformly for numeric conversion, without changing the unit displays in logs, the UI, comments, and configuration messages? Otherwise we would need to modify all of the UI, log, comment, and configuration text to keep them consistent, and there is no guarantee that everything can be modified, or that no problems will arise after the modification.
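To make the binary (1024-base) suffix behavior concrete, here is a small self-contained Java sketch of suffix-based byte-string parsing in the spirit of the byteSuffixes table quoted above. The class and method names are illustrative assumptions, not Spark's actual JavaUtils API, and only a few suffixes are included.

```java
import java.util.Map;

public class ByteStringParser {
    // Binary multipliers for a subset of the suffixes in the quoted table.
    // Note "k"/"kb" both mean KiB (1024), mirroring Spark's convention.
    static final Map<String, Long> SUFFIXES = Map.of(
        "b", 1L,
        "k", 1024L, "kb", 1024L,
        "m", 1024L * 1024, "mb", 1024L * 1024,
        "g", 1024L * 1024 * 1024, "gb", 1024L * 1024 * 1024);

    // Parse strings like "64k" or "1m" into a byte count.
    static long parseBytes(String s) {
        String lower = s.trim().toLowerCase();
        int i = 0;
        while (i < lower.length() && Character.isDigit(lower.charAt(i))) i++;
        long value = Long.parseLong(lower.substring(0, i));
        String suffix = lower.substring(i);
        // No suffix means plain bytes; unknown suffixes are not handled here.
        return value * (suffix.isEmpty() ? 1L : SUFFIXES.get(suffix));
    }

    public static void main(String[] args) {
        // The spark.kryoserializer.buffer default "64k" = 64 * 1024 bytes.
        System.out.println(parseBytes("64k")); // 65536
    }
}
```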
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/22683 @srowen Thank you for your review. I agree with you, and I will make changes in the near future. @wangyum Thank you for your help.
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/22683 It's ok. @ajbozarth
[GitHub] spark pull request #22683: [SPARK-25696] The storage memory displayed on spa...
GitHub user httfighter opened a pull request: https://github.com/apache/spark/pull/22683

[SPARK-25696] The storage memory displayed on spark Application UI is incorrect.

## What changes were proposed in this pull request?

Change the base of the unit conversion in the formatBytes function to 1024.

## How was this patch tested?

Manual tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/httfighter/spark SPARK-25696
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22683.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #22683

commit 9e45697296039e55e85dd204788e287c9c60fceb
Author: 韩田田00222924
Date: 2018-10-10T06:47:36Z
[SPARK-25696] The storage memory displayed on spark Application UI is incorrect.
[GitHub] spark issue #22487: [SPARK-25477] “INSERT OVERWRITE LOCAL DIRECTORY”, ...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/22487 @wangyum In Hive, INSERT OVERWRITE LOCAL DIRECTORY does not use a local staging directory but a distributed staging directory, so Hive does not have this problem.
[GitHub] spark pull request #22487: [SPARK-25477] “INSERT OVERWRITE LOCAL DIRECTORY...
GitHub user httfighter opened a pull request: https://github.com/apache/spark/pull/22487

[SPARK-25477] “INSERT OVERWRITE LOCAL DIRECTORY”: the data files allocated on the non-driver node will not be written to the specified output directory

## What changes were proposed in this pull request?

Because the "INSERT OVERWRITE LOCAL DIRECTORY" feature uses a local staging directory to load data into the specified output directory, data files allocated on a non-driver node are not written to the specified output directory. In saveAsHiveFile.scala, the code decides, based on the output directory, whether to use a local or a distributed staging directory. I changed the getStagingDir() method, modifying its first argument from "new Path(extURI.getScheme, extURI.getAuthority, extURI.getPath)" to "new Path(extURI.getPath)". If Spark depends on a distributed storage system, that system will be used first; if not, the local filesystem will be used. The staging location is then selected automatically instead of being derived from the output directory.

## How was this patch tested?

Manual tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/httfighter/spark SPARK-25477
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22487.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #22487

commit 8fe6d095fd2ce1a1a129a46345b1cecf6df70d8c
Author: 韩田田00222924
Date: 2018-09-20T07:57:06Z
[SPARK-25477] “INSERT OVERWRITE LOCAL DIRECTORY”: the data files allocated on the non-driver node will not be written to the specified output directory
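The effect of the getStagingDir() change can be sketched with plain java.net.URI. This is a hypothetical illustration of the two path constructions (Spark's actual code uses Hadoop's Path, and the helper names below are invented for the sketch): keeping the scheme and authority pins the staging path to the output directory's filesystem, while keeping only the path component lets the default (possibly distributed) filesystem resolve it.

```java
import java.net.URI;

public class StagingPathDemo {
    // Mirrors new Path(extURI.getScheme, extURI.getAuthority, extURI.getPath):
    // the result is pinned to the output directory's filesystem.
    static String pinnedToOutputFs(URI extURI) {
        String authority = extURI.getAuthority() == null ? "" : extURI.getAuthority();
        return extURI.getScheme() + "://" + authority + extURI.getPath();
    }

    // Mirrors new Path(extURI.getPath): scheme-less, so the default
    // filesystem configured for the job can resolve it.
    static String defaultFsPath(URI extURI) {
        return extURI.getPath();
    }

    public static void main(String[] args) {
        URI local = URI.create("file:///tmp/output");
        System.out.println(pinnedToOutputFs(local)); // file:///tmp/output
        System.out.println(defaultFsPath(local));    // /tmp/output
    }
}
```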
[GitHub] spark issue #21826: [SPARK-24872] Replace the symbol '||' of Or operator wit...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/21826 It failed again, and I don't know what the problem is. Could you help me trigger it again? @viirya
[GitHub] spark issue #21826: [SPARK-24872] Replace the symbol '||' of Or operator wit...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/21826 Thank you very much! @viirya
[GitHub] spark issue #21826: [SPARK-24872] Replace the symbol '||' of Or operator wit...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/21826 The last test build failed, but all the test cases passed. I don't know what the problem is. Could you help me trigger it again? @HyukjinKwon
[GitHub] spark pull request #21826: [SPARK-24872] Replace the symbol '||' of Or opera...
Github user httfighter commented on a diff in the pull request: https://github.com/apache/spark/pull/21826#discussion_r205934501

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/PredicateSuite.scala ---
@@ -455,4 +456,10 @@ class PredicateSuite extends SparkFunSuite with ExpressionEvalHelper {
     interpreted.initialize(0)
     assert(interpreted.eval(new UnsafeRow()))
   }
+
+  test("[SPARK-24872] Replace the symbol '||' of Or operator with 'or'") {
--- End diff --

Thank you! I have made changes to the code. @HyukjinKwon
[GitHub] spark issue #21826: [SPARK-24872] Replace the symbol '||' of Or operator wit...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/21826

I have submitted new code. Could you help me review it? Thank you! @HyukjinKwon @viirya @gatorsmile @rxin @hvanhovell

In Hive, "||" performs the function of STRING concat, and the symbol of the "Or" operator is 'or':

hive> explain select * from aa where id=1 or id=2;

and its Filter part is:

predicate: ((id = 1) or (id = 2)) (type: boolean)

In Spark, "||" also performs the function of STRING concat, so we can change the symbol of the "Or" operator to 'or':

spark-sql> explain extended select * from aa where id=1 or id=2;
== Parsed Logical Plan ==
'Project [*]
+- 'Filter (('id = 1) or ('id = 2))
   +- 'UnresolvedRelation `aa`
[GitHub] spark issue #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/21826 I have a suggestion, though I don't know whether it is reasonable. Since Spark already supports "||" as a string concatenation function, could we make the following improvement: in the SQL projection part, use "||" as string concatenation; in the SQL filter part, use "||" as the "Or" operation. @HyukjinKwon @viirya @gatorsmile @rxin @hvanhovell
[GitHub] spark issue #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/21826

I did the following tests in MySQL.

mysql> select "abc" || "def";
+----------------+
| "abc" || "def" |
+----------------+
|              0 |
+----------------+

mysql> select "abc" "def";
+--------+
| abc    |
+--------+
| abcdef |
+--------+

mysql> select * from aa where id=1 || id=2;
+------+------+
| id   | name |
+------+------+
|    1 | sdf  |
|    2 | ader |
+------+------+

mysql> select * from aa where id=1 or id=2;
+------+------+
| id   | name |
+------+------+
|    1 | sdf  |
|    2 | ader |
+------+------+

It seems that "||" does not act as a string connector here, but as an "or" operation. I don't know if my tests are correct.
[GitHub] spark pull request #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...
Github user httfighter commented on a diff in the pull request: https://github.com/apache/spark/pull/21826#discussion_r204274497

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
@@ -442,8 +442,6 @@ case class Or(left: Expression, right: Expression) extends BinaryOperator with P
   override def inputType: AbstractDataType = BooleanType
-  override def symbol: String = "||"
--- End diff --

I am sorry for this error; I will try to avoid it in the future.
[GitHub] spark pull request #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...
Github user httfighter commented on a diff in the pull request: https://github.com/apache/spark/pull/21826#discussion_r204274481

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
@@ -442,8 +442,6 @@ case class Or(left: Expression, right: Expression) extends BinaryOperator with P
   override def inputType: AbstractDataType = BooleanType
-  override def symbol: String = "||"
--- End diff --

I am sorry for this error; I will try to avoid it in the future.
[GitHub] spark pull request #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...
GitHub user httfighter opened a pull request: https://github.com/apache/spark/pull/21826

[SPARK-24872] Remove the symbol “||” of the “OR” operation

## What changes were proposed in this pull request?

"||" performs the function of STRING concat, and it is also the symbol of the "OR" operation. When I want to use "||" as the "OR" operation, I find that it performs STRING concat:

spark-sql> explain extended select * from aa where id==1 || id==2;
== Parsed Logical Plan ==
'Project [*]
+- 'Filter (('id = concat(1, 'id)) = 2)
   +- 'UnresolvedRelation `aa`

spark-sql> select "abc" || "DFF";

And the result is "abcDFF". In predicates.scala, "||" is the symbol of the "Or" operation. Could we remove it?

## How was this patch tested?

We can test this patch with unit tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/httfighter/spark SPARK-24872
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21826.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #21826

commit fb98029c451023789a2c7fa0e758c6c8790bbaea
Author: 韩田田00222924
Date: 2018-07-20T09:19:54Z
SPARK-24872 Remove the symbol “||” of the “OR” operation
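The proposal above is ultimately about how an operator renders itself in a plan string. The toy Java sketch below (hypothetical classes, not Spark's catalyst expressions) shows the idea: a binary operator prints itself using its symbol string, so removing or changing Or's "||" symbol affects only the textual plan, not evaluation.

```java
public class OrSymbolDemo {
    // Hypothetical stand-in for how a BinaryOperator renders "left symbol right".
    static String render(String left, String symbol, String right) {
        return "(" + left + " " + symbol + " " + right + ")";
    }

    public static void main(String[] args) {
        // With the "||" symbol the plan string is ambiguous with concat;
        // with "or" it is unambiguous.
        System.out.println(render("'id = 1", "||", "'id = 2"));
        System.out.println(render("'id = 1", "or", "'id = 2"));
    }
}
```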
[GitHub] spark issue #21767: SPARK-24804 There are duplicate words in the test title ...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/21767 Thank you for your comments, @srowen @HyukjinKwon @wangyum. I will try to contribute more valuable issues!
[GitHub] spark pull request #21767: SPARK-24804 There are duplicate words in the titl...
GitHub user httfighter opened a pull request: https://github.com/apache/spark/pull/21767

SPARK-24804 There are duplicate words in the title in the DatasetSuite

## What changes were proposed in this pull request?

In DatasetSuite.scala, at line 1299, test("SPARK-19896: cannot have circular references in in case class") contains the duplicate words "in in". We can get rid of one.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/httfighter/spark inin
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21767.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #21767

commit 1442ba751ef3b6b6212b8b44893d974cac4963b5
Author: 韩田田00222924
Date: 2018-07-14T07:00:20Z
SPARK-24804 There are duplicate words in the title in the DatasetSuite
[GitHub] spark issue #21023: [SPARK-23949] makes && supports the function of predicat...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/21023 @gatorsmile Thank you very much! Can you help me take a look at this PR?
[GitHub] spark pull request #21023: [SPARK-23949] makes && supports the function of p...
GitHub user httfighter opened a pull request: https://github.com/apache/spark/pull/21023

[SPARK-23949] makes && supports the function of predicate operator and

https://issues.apache.org/jira/browse/SPARK-23949

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/httfighter/spark SPARK-23949
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21023.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #21023

commit d76d1b88cd2cefe1cdb9f4ce6519429ce7df3ba0
Author: httfighter <han.tiantian@...>
Date: 2018-04-10T10:32:40Z
[SPARK-23949] makes && supports the function of predicate operator and
[GitHub] spark issue #19380: [SPARK-22157] [SQL] The uniux_timestamp method handles t...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/19380 I understand everyone's worries, but I have a few thoughts. Firstly, the native unix_timestamp itself supports the "yyyy-MM-dd HH:mm:ss.SSS" date format, but the milliseconds are lost from the result when I use it. That is clearly a bug, since it gives users wrong results, and I think it should be fixed. Secondly, unix_timestamp, from_unixtime, and to_unix_timestamp all have a similar bug, and these are the only three related methods. I think the unix-time data type of these three methods should be defined as DoubleType, not LongType; representing the value in milliseconds instead would bring more problems.
[GitHub] spark issue #19380: [SPARK-22157] [SQL] The uniux_timestamp method handles t...
Github user httfighter commented on the issue: https://github.com/apache/spark/pull/19380 In an RDBMS, the unix_timestamp method can keep the milliseconds. For example, executing the command select unix_timestamp("2017-10-10 10:10:20.111") from test; gives the result 1490667020.111. But Spark's native unix_timestamp method loses the milliseconds, and we want to keep them.
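The truncation being discussed can be sketched outside of Spark. The Java snippet below (an illustration, not Spark's implementation) parses a timestamp with a .SSS fraction and shows that converting epoch milliseconds to whole seconds as a long drops the fraction, while keeping a double preserves it; this is the motivation for preferring DoubleType over LongType.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class UnixTimestampDemo {
    // Parse a "yyyy-MM-dd HH:mm:ss.SSS" string (interpreted as UTC here for
    // simplicity) and return the epoch time in seconds as a double, so the
    // millisecond fraction survives.
    static double toUnixTimestamp(String s) {
        DateTimeFormatter fmt =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");
        LocalDateTime t = LocalDateTime.parse(s, fmt);
        return t.toInstant(ZoneOffset.UTC).toEpochMilli() / 1000.0;
    }

    public static void main(String[] args) {
        double ts = toUnixTimestamp("2017-10-10 10:10:20.111");
        System.out.println(ts);        // keeps the .111 fraction
        System.out.println((long) ts); // casting to long truncates it
    }
}
```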
[GitHub] spark pull request #19380: [SPARK-22157] [SQL] The uniux_timestamp method ha...
GitHub user httfighter opened a pull request: https://github.com/apache/spark/pull/19380

[SPARK-22157] [SQL] The uniux_timestamp method handles the time field that is lost in mill

## What changes were proposed in this pull request?

Keep the millisecond part of the time field.

## How was this patch tested?

Added new test cases and updated existing test cases.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/httfighter/spark branch_unix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19380.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #19380

commit 697e61acb8c2ea1d5b379a95c0ab7ae0fb4137b7
Author: httfighter <han.tiant...@zte.com.cn>
Date: 2017-09-28T12:03:48Z
The uniux_timestamp method handles the time field that is lost in mill