[GitHub] [spark] cloud-fan commented on a change in pull request #24094: [SPARK-27162][SQL] Add new method getOriginalMap in CaseInsensitiveStringMap
URL: https://github.com/apache/spark/pull/24094#discussion_r266361233

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java ##
@@ -35,16 +37,29 @@
  */
 @Experimental
 public class CaseInsensitiveStringMap implements Map<String, String> {
+  private final Logger logger = LoggerFactory.getLogger(CaseInsensitiveStringMap.class);
+
+  private String unsupportedOperationMsg = "CaseInsensitiveStringMap is read-only.";

Review comment:
nit:
```
private void failAsReadonly() {
  throw new UnsupportedOperationException("CaseInsensitiveStringMap is read-only.");
}
```

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
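The nit above can be illustrated with a minimal sketch. `ReadOnlyOptions` is a hypothetical stand-in, not the Spark class, and its method set is an assumption made for illustration; only the `failAsReadonly` helper name comes from the review comment. The point is that every mutating method funnels through one helper, so the message lives in a single place.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a read-only options map; not the Spark implementation.
final class ReadOnlyOptions {
  private final Map<String, String> delegate;

  ReadOnlyOptions(Map<String, String> source) {
    // Defensive copy so later changes to the caller's map are not visible here.
    this.delegate = new HashMap<>(source);
  }

  // Single place that raises the read-only error, as the review comment suggests.
  private void failAsReadonly() {
    throw new UnsupportedOperationException("ReadOnlyOptions is read-only.");
  }

  String get(String key) {
    return delegate.get(key);
  }

  String put(String key, String value) {
    failAsReadonly();
    // Never reached at runtime; needed because the compiler cannot tell
    // that the helper always throws.
    return null;
  }

  void clear() {
    failAsReadonly();
  }
}
```

One trade-off of a `void` helper is the extra `return null` in methods that return a value; throwing directly with a shared message constant avoids that.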
[GitHub] [spark] cloud-fan commented on a change in pull request #24094: [SPARK-27162][SQL] Add new method getOriginalMap in CaseInsensitiveStringMap
URL: https://github.com/apache/spark/pull/24094#discussion_r266317729

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java ##
@@ -78,24 +83,22 @@ public String get(Object key) {
   @Override
   public String put(String key, String value) {
-    return delegate.put(toLowerCase(key), value);
+    throw new UnsupportedOperationException();

Review comment:
Why do we need a warning message right before throwing an exception?
[GitHub] [spark] cloud-fan commented on a change in pull request #24094: [SPARK-27162][SQL] Add new method getOriginalMap in CaseInsensitiveStringMap
URL: https://github.com/apache/spark/pull/24094#discussion_r266309297

## File path: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala ##
@@ -32,7 +34,7 @@ import org.apache.spark.sql.catalyst.parser.ParserInterface
 import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
 import org.apache.spark.sql.execution._
 import org.apache.spark.sql.streaming.StreamingQueryManager
-import org.apache.spark.sql.util.{ExecutionListenerManager, QueryExecutionListener}
+import org.apache.spark.sql.util.{CaseInsensitiveStringMap, ExecutionListenerManager, QueryExecutionListener}

Review comment:
unnecessary changes
[GitHub] [spark] cloud-fan commented on a change in pull request #24094: [SPARK-27162][SQL] Add new method getOriginalMap in CaseInsensitiveStringMap
URL: https://github.com/apache/spark/pull/24094#discussion_r266309127

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java ##
@@ -78,24 +83,22 @@ public String get(Object key) {
   @Override
   public String put(String key, String value) {
-    return delegate.put(toLowerCase(key), value);
+    throw new UnsupportedOperationException();

Review comment:
Let's add an error message saying that the map is read-only.
[GitHub] [spark] cloud-fan commented on a change in pull request #24094: [SPARK-27162][SQL] Add new method getOriginalMap in CaseInsensitiveStringMap
URL: https://github.com/apache/spark/pull/24094#discussion_r266309009

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java ##
@@ -40,11 +40,16 @@ public static CaseInsensitiveStringMap empty() {
     return new CaseInsensitiveStringMap(new HashMap<>(0));
   }

+  private final Map<String, String> original;
+
   private final Map<String, String> delegate;

   public CaseInsensitiveStringMap(Map<String, String> originalMap) {
-    this.delegate = new HashMap<>(originalMap.size());
-    putAll(originalMap);
+    original = new HashMap<>(originalMap);
+    delegate = new HashMap<>(originalMap.size());
+    for (Map.Entry<? extends String, ? extends String> entry : originalMap.entrySet()) {
+      delegate.put(toLowerCase(entry.getKey()), entry.getValue());

Review comment:
If the key is already in the case-insensitive map, we should fail and report that duplicated keys were detected.
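A minimal sketch of the fail-on-duplicate check requested above. The class `DuplicateKeyCheck`, the method `lowerCaseKeys`, the exception type, and the message text are all assumptions made for illustration and are not taken from the pull request; keys are assumed to be non-null.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; not the Spark implementation.
final class DuplicateKeyCheck {
  // Builds a map with lower-cased keys and fails if two original keys
  // differ only by case.
  static Map<String, String> lowerCaseKeys(Map<String, String> originalMap) {
    Map<String, String> delegate = new HashMap<>(originalMap.size());
    for (Map.Entry<String, String> entry : originalMap.entrySet()) {
      String key = entry.getKey().toLowerCase(Locale.ROOT);
      if (delegate.containsKey(key)) {
        // Assumed exception type and wording.
        throw new IllegalArgumentException("Duplicated keys detected: " + key);
      }
      delegate.put(key, entry.getValue());
    }
    return delegate;
  }
}
```

Checking with `containsKey` rather than the return value of `put` keeps the check correct even when a stored value is `null`.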
[GitHub] [spark] cloud-fan commented on a change in pull request #24094: [SPARK-27162][SQL] Add new method getOriginalMap in CaseInsensitiveStringMap
URL: https://github.com/apache/spark/pull/24094#discussion_r266308703

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java ##
@@ -40,11 +40,16 @@ public static CaseInsensitiveStringMap empty() {
     return new CaseInsensitiveStringMap(new HashMap<>(0));
   }

+  private final Map<String, String> original;
+
   private final Map<String, String> delegate;

   public CaseInsensitiveStringMap(Map<String, String> originalMap) {
-    this.delegate = new HashMap<>(originalMap.size());
-    putAll(originalMap);
+    original = new HashMap<>(originalMap);
+    delegate = new HashMap<>(originalMap.size());
+    for (Map.Entry<? extends String, ? extends String> entry : originalMap.entrySet()) {

Review comment:
Is the `? extends String` required? Can we just use `String`?
[GitHub] [spark] cloud-fan commented on a change in pull request #24094: [SPARK-27162][SQL] Add new method getOriginalMap in CaseInsensitiveStringMap
URL: https://github.com/apache/spark/pull/24094#discussion_r265911838

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java ##
@@ -78,11 +81,13 @@ public String get(Object key) {
   @Override
   public String put(String key, String value) {
+    original.put(key, value);

Review comment:
We can just throw an exception in these methods and say that this map is read-only.
[GitHub] [spark] cloud-fan commented on a change in pull request #24094: [SPARK-27162][SQL] Add new method getOriginalMap in CaseInsensitiveStringMap
URL: https://github.com/apache/spark/pull/24094#discussion_r265843483

## File path: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala ##
@@ -96,6 +98,11 @@ private[sql] class SessionState(
     hadoopConf
   }

+  def newHadoopConfWithCaseInsensitiveOptions(options: CaseInsensitiveStringMap): Configuration = {

Review comment:
Then we should document it in `CaseInsensitiveMap`. Data source developers can't access `SessionState`.
[GitHub] [spark] cloud-fan commented on a change in pull request #24094: [SPARK-27162][SQL] Add new method getOriginalMap in CaseInsensitiveStringMap
URL: https://github.com/apache/spark/pull/24094#discussion_r265842993

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java ##
@@ -78,11 +81,13 @@ public String get(Object key) {
   @Override
   public String put(String key, String value) {
+    original.put(key, value);

Review comment:
The thing that worries me most is the inconsistency between the case-insensitive map and the original map. I think we should either fail or keep the latter entry if `a -> 1, A -> 2` appear together.

One simplification: `CaseInsensitiveStringMap` is read by data sources and can be read-only. Then it is easier to resolve conflicting entries at the beginning.
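The inconsistency described above can be reproduced with a small sketch of the "keep the latter entry" option. `KeepLatterEntry` and `lowerCaseKeys` are hypothetical names, and the example assumes the caller supplies an insertion-ordered map such as `LinkedHashMap`; with a plain `HashMap`, iteration order is unspecified, so "latter" would not be well defined.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; not the Spark implementation.
final class KeepLatterEntry {
  // Lower-cases every key; when two keys collide, the entry that is
  // iterated later overwrites the earlier one.
  static Map<String, String> lowerCaseKeys(Map<String, String> originalMap) {
    Map<String, String> delegate = new HashMap<>(originalMap.size());
    for (Map.Entry<String, String> entry : originalMap.entrySet()) {
      delegate.put(entry.getKey().toLowerCase(Locale.ROOT), entry.getValue());
    }
    return delegate;
  }
}
```

For an insertion-ordered input holding `a -> 1` followed by `A -> 2`, the original map keeps two entries while the lower-cased view keeps a single entry whose value is `2`, which is the mismatch between the two maps that the comment refers to.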
[GitHub] [spark] cloud-fan commented on a change in pull request #24094: [SPARK-27162][SQL] Add new method getOriginalMap in CaseInsensitiveStringMap
URL: https://github.com/apache/spark/pull/24094#discussion_r265837482

## File path: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala ##
@@ -96,6 +98,11 @@ private[sql] class SessionState(
     hadoopConf
   }

+  def newHadoopConfWithCaseInsensitiveOptions(options: CaseInsensitiveStringMap): Configuration = {

Review comment:
I don't think we should pollute `SessionState` with the case insensitive map stuff. Can we inline this method?
[GitHub] [spark] cloud-fan commented on a change in pull request #24094: [SPARK-27162][SQL] Add new method getOriginalMap in CaseInsensitiveStringMap
URL: https://github.com/apache/spark/pull/24094#discussion_r265837236

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java ##
@@ -40,9 +40,12 @@ public static CaseInsensitiveStringMap empty() {
     return new CaseInsensitiveStringMap(new HashMap<>(0));
   }

+  private final Map<String, String> original;
+
   private final Map<String, String> delegate;

   public CaseInsensitiveStringMap(Map<String, String> originalMap) {
+    this.original = new HashMap<>(originalMap);

Review comment:
This should be `new HashMap<>(originalMap.size())`; otherwise we add data to it twice.
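The distinction behind this comment is between two `java.util.HashMap` constructors. The sketch below uses a hypothetical class `HashMapConstructors` purely to contrast them: the copy constructor already contains every entry of its argument, while the `int` constructor only sets the initial capacity and leaves the map empty for a later `putAll` or loop.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class contrasting two HashMap constructors.
final class HashMapConstructors {
  // Copy constructor: the new map already holds every entry of source.
  static Map<String, String> copyOf(Map<String, String> source) {
    return new HashMap<>(source);
  }

  // Capacity constructor: the new map is empty; the argument is only
  // the initial capacity.
  static Map<String, String> emptyWithCapacity(Map<String, String> source) {
    return new HashMap<>(source.size());
  }
}
```

If the constructor copies the entries and then also inserts them one by one, each entry is written twice, which is harmless for correctness but redundant.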