[GitHub] [cassandra] jacek-lewandowski commented on a change in pull request #1031: Fix queries on empty partitions with static data (CASSANDRA-16686)

GitBox Tue, 01 Jun 2021 05:10:55 -0700


jacek-lewandowski commented on a change in pull request #1031:
URL: https://github.com/apache/cassandra/pull/1031#discussion_r643043366




##########
File path: src/java/org/apache/cassandra/db/filter/ColumnFilter.java
##########
@@ -49,111 +48,166 @@
  * in its request.
  *
  * The reason for distinguishing those 2 sets is that due to the CQL semantic (see #6588 for more details), we
- * often need to internally fetch all regular columns for the queried table, but can still do some optimizations for
- * those columns that are not directly queried by the user (see #10657 for more details).
+ * often need to internally fetch all regular columns or all columns for the queried table, but can still do some
+ * optimizations for those columns that are not directly queried by the user (see #10657 for more details).
  *
  * Note that in practice:
  *   - the _queried_ columns set is always included in the _fetched_ one.
- *   - whenever those sets are different, we know 1) the _fetched_ set contains all regular columns for the table and 2)
- *     _fetched_ == _queried_ for static columns, so we don't have to record this set, we just keep a pointer to the
- *     table metadata. The only set we concretely store is thus the _queried_ one.
+ *   - whenever those sets are different, the _fetched_ columns can contains either all the regular columns and
+ *     the static columns queried by the user or all the regular and static queries. If the query is a partition level
+ *     query (no restrictions on clustering or regular columns) all the static columns will need to be fetched as
+ *     some data will need to be returned to the user if the partition has no row but some static data. For all the
+ *     other scenarios only the all the regular columns are required.
  *   - in the special case of a {@code SELECT *} query, we want to query all columns, and _fetched_ == _queried.
- *     As this is a common case, we special case it by keeping the _queried_ set {@code null} (and we retrieve
- *     the columns through the metadata pointer).
+ *     As this is a common case, we special case it by using a specific subclass for it.
  *
  * For complex columns, this class optionally allows to specify a subset of the cells to query for each column.
  * We can either select individual cells by path name, or a slice of them. Note that this is a sub-selection of
  * _queried_ cells, so if _fetched_ != _queried_, then the cell selected by this sub-selection are considered
  * queried and the other ones are considered fetched (and if a column has some sub-selection, it must be a queried
  * column, which is actually enforced by the Builder below).
  */
-public class ColumnFilter
+public abstract class ColumnFilter
 {
     private final static Logger logger = LoggerFactory.getLogger(ColumnFilter.class);
 
     public static final ColumnFilter NONE = selection(RegularAndStaticColumns.NONE);
 
     public static final Serializer serializer = new Serializer();
 
-    // True if _fetched_ includes all regular columns (and any static in _queried_), in which case metadata must not be
-    // null. If false, then _fetched_ == _queried_ and we only store _queried_.
-    @VisibleForTesting
-    final boolean fetchAllRegulars;
+    /**
+     * The fetching strategy for the different queries.
+     */
+    private enum FetchingStrategy
+    {
+        /**
+         * This strategy will fetch all the regular and static columns.
+         *
+         * <p>According to the CQL semantic a partition exist if has at least one row or one of its static columns not null.
+         * For queries that have no restrictions on the clustering or regular columns, C* returns will return some data for

Review comment:
       nit: `C* returns will return` -> `C* returns` or `C* will return`






