GitHub user dbtsai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21235#discussion_r186240261
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
    @@ -339,9 +339,16 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
         }
       }
     
    -  private def assertNotBucketed(operation: String): Unit = {
    -    if (numBuckets.isDefined || sortColumnNames.isDefined) {
    -      throw new AnalysisException(s"'$operation' does not support bucketing right now")
    +  private def assertNotBucketedOrSorted(operation: String): Unit = {
    +    (numBuckets.isDefined, sortColumnNames.isDefined) match {
    +      case (true, true) =>
    +        throw new AnalysisException(
    +          s"'$operation' does not support bucketing and sorting right now")
    +      case (true, false) =>
    +        throw new AnalysisException(s"'$operation' does not support bucketing right now")
    +      case (false, true) =>
    +        throw new AnalysisException(s"'$operation' does not support sorting right now")
    --- End diff --
    
    I know this is the sorting within each bucket.
    
    If a user calls `writer.sortBy` without calling `bucketBy`, they will get
    `'$operation' does not support bucketing right now`, which makes it hard to
    understand what went wrong.
    
    For the case where sortBy is set and bucketBy is not, how about I change the
    error message to `sortBy must be used together with bucketBy, and
    '$operation' does not support bucketBy right now`?
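    
    A minimal sketch of how the proposed wording could slot into the match from
    the diff. It is not the actual `DataFrameWriter` code: the object name
    `BucketingCheck`, the function `unsupportedMessage`, and its Boolean
    parameters are hypothetical, and it returns an `Option[String]` instead of
    throwing `AnalysisException` so that it runs without Spark on the classpath.

```scala
// Hypothetical stand-alone sketch; names and signature are assumptions,
// not part of Spark's DataFrameWriter.
object BucketingCheck {
  // Returns the error message for an operation that supports neither
  // bucketing nor sorting, or None when neither option is set.
  def unsupportedMessage(
      operation: String,
      bucketed: Boolean,
      sorted: Boolean): Option[String] =
    (bucketed, sorted) match {
      case (true, true) =>
        Some(s"'$operation' does not support bucketing and sorting right now")
      case (true, false) =>
        Some(s"'$operation' does not support bucketing right now")
      case (false, true) =>
        // Wording proposed in this comment for sortBy without bucketBy.
        Some(s"sortBy must be used together with bucketBy, and " +
          s"'$operation' does not support bucketBy right now")
      case (false, false) =>
        None
    }
}
```

    In the real method, each `Some(msg)` branch would presumably throw
    `new AnalysisException(msg)` instead, as the diff already does.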


---
