Github user dbtsai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21235#discussion_r186240261

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
    @@ -339,9 +339,16 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
         }
       }

    -  private def assertNotBucketed(operation: String): Unit = {
    -    if (numBuckets.isDefined || sortColumnNames.isDefined) {
    -      throw new AnalysisException(s"'$operation' does not support bucketing right now")
    +  private def assertNotBucketedOrSorted(operation: String): Unit = {
    +    (numBuckets.isDefined, sortColumnNames.isDefined) match {
    +      case (true, true) =>
    +        throw new AnalysisException(
    +          s"'$operation' does not support bucketing and sorting right now")
    +      case (true, false) =>
    +        throw new AnalysisException(s"'$operation' does not support bucketing right now")
    +      case (false, true) =>
    +        throw new AnalysisException(s"'$operation' does not support sorting right now")
    --- End diff --

    I know this is the sorting within each bucket. If a user just calls `writer.sortBy` without calling `bucketBy`, the user will get `s"'$operation' does not support bucketing right now"`, which makes it hard to understand what is going on.

    For the case where `sortBy` is set but `bucketBy` is not, how about I change the error message to `sortBy must be used together with bucketBy, and '$operation' does not support bucketBy right now`?
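
To make the proposal concrete, here is a minimal sketch of how the three cases could map to messages if the suggested wording were adopted for the sort-only case. It is not the code in the PR: `WriterState`, `BucketingCheck`, and `bucketingError` are hypothetical names introduced only for illustration, and the check returns the message as an `Option[String]` instead of throwing `AnalysisException`, so that it runs without Spark on the classpath.

```scala
// Hypothetical stand-in for the two writer fields the check inspects;
// this is not Spark's DataFrameWriter.
final case class WriterState(
    numBuckets: Option[Int] = None,
    sortColumnNames: Option[Seq[String]] = None)

object BucketingCheck {
  // Returns the error message for the given state, or None when neither
  // bucketing nor sorting is set and the operation may proceed.
  def bucketingError(state: WriterState, operation: String): Option[String] =
    (state.numBuckets.isDefined, state.sortColumnNames.isDefined) match {
      case (true, true) =>
        Some(s"'$operation' does not support bucketing and sorting right now")
      case (true, false) =>
        Some(s"'$operation' does not support bucketing right now")
      case (false, true) =>
        // Wording proposed in the comment above for sortBy without bucketBy.
        Some("sortBy must be used together with bucketBy, " +
          s"and '$operation' does not support bucketBy right now")
      case (false, false) =>
        None
    }
}
```

In the real method each `Some(...)` branch would presumably throw `AnalysisException` with the same message; the sort-only branch is the only one whose text differs from the diff.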