[
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617221#comment-14617221
]
Joshua McKenzie commented on CASSANDRA-6477:
--------------------------------------------
h6. Overall:
Needs comments. Class level, javadoc where appropriate, etc.
h6. {{CFMetaData}}
* duplicate entry for comparison if triggers added in CFMetadata.triggers
* nit: space after CFMetaData.getMaterializedViews function end
h6. {{MaterializedViewManager}}
* buildIfRequired: scope of whether it's required or skipped isn't in this
method or within view.build(), so it looks like it's just building (or at least
submitting the builder to the CompactionManager)
* You're using SystemKeyspace.setIndexRemoved to mark the MV removed in
removeMaterializedView but not the parallel setIndexBuilt to set them as built.
* allViews has unnecessary genericity
* nits:
** Some unused imports
** Consider renaming reload to indicate what it's reloading (e.g. reloadViews,
reloadChickens, reloadDolphins)
h6. {{MaterializedView}}
* createForDeletionInfo:
** Consider caching/precomputing the results of CFMetaData.hasComplexColumns
rather than iterating through them twice on each call (once on
CFMetaData.hascomplexColumns, again in for loop in createForDeletionInfo)
*** Perhaps hooking into addOrReplaceColumnDefinition, removeColumnDefinition
and updating MV's with the data, keep a set inside MaterializedView and
reference that
* Don't need MaterializedView.reload() as it's just a passthrough to build and
used in 1 place
* Tighten up scoping on member variables - some package private that can be
explicitly private
* ctor: Duplication in the ctor in treatment of MVDefinition partition and
clustering columns - could use a slight refactor to a function.
* createTombstone / createComplexTombstone: refactor out building
viewClusteringValues into method to remove duplication
* targetPartitionKey: collapse spacing on:
{noformat}
return viewCfs
.partitioner
.decorateKey(CFMetaData
.serializePartitionKey(viewCfs
.metadata
.getKeyValidatorAsClusteringComparator()
.make(partitionKey)));
{noformat}
to something like:
{noformat}
return
viewCfs.partitioner.decorateKey(CFMetaData.serializePartitionKey(viewCfs.metadata
.getKeyValidatorAsClusteringComparator()
.make(partitionKey)));
{noformat}
* createPartitionTombstonesForUpdates: Can simply return mutation @ end of
function
* createForDeletionInfo: unused parameter consistency
* Inconsistent naming - using mutationUnits scattered throughout vs.
LiveRowState
* Rename {{query}} to {{queryMaterializedViewData}} or something similar that
denotes what you're querying. Perhaps {{buildMVData}}
* nits:
** Some unused imports
** ctor: Change nonPrimaryKeyCol to allPrimaryKeyCol as true, flip to false
when non found so we're not assigning a double-negative to
targetHasAllPrimaryKeyColumns
** extra space in MaterializedView.createForDeletionInfo
** ctor declaration fits on 1 line < 120 char, same for some other lines that
have been wrapped
** hte in javadoc for targetPartitionKey
h6. {{MaterializedViewDefinition}}
* Consider caching sets of columns in the MV rather than pulling from
CFMetaData and iterating. This would allow for faster lookup and simpler code
than having to iterate across all members (MaterializedViewDefinition.selects
for instance).
* Consider using Sets of columns rather than Lists internally, would remove a
lot of O(N) lookups and duplicate code logic scattered throughout the class
(selects, renameColumn, etc)
* selects(): if included.isEmpty() is apparently an indicator that it includes
all. We should 1) comment that and 2) wrap a method around that behavior, e.g.
'boolean isAllInclusive()' that documents that behavior in the code.
* nits:
** I prefer copy constructors to a .copy() method and I think I've seen us err
on the side of the copy constructor. I could be wrong here.
** assert included != null and !isEmpty on ctor
** extra whitespace in renameColumn @ top of method, @ end of method
h6. {{LiveRowState}}
* The name conflates with LivenessInfo (at least for me) which we discussed on
IRC. My biggest hurdle to grokking this class was un-plugging that and
re-plugging the idea that it's really about materializing the data for the
MaterializedView. Perhaps rename to something like {{TemporalRow}}, with
{{LRSCell}} becoming {{TemporalCell}}? While LRSCell doesn't really have any
temporal data above and beyond a general Cell, it does contain its own
{{reconcile}} that takes temporality into account which makes sense.
* Instead of {{addUnit}} / {{getExistingUnit}} in LiveRowState.Set, if going
with the above these could be named {{addRow}}, {{getExistingRow}}
* Think we can get rid of getInternedUnit entirely and just have a section in
addUnit to initialize empty entries
* Consider having LRSCell implement db.rows.Cell and adding a
{{Conflicts.resolveCells(Cell left, Cell right, int nowInSec)}} method that
passes through to {{Conficts.resolveRegular}}. Simplifies up a few different
users of the method throughout the code-base.
* Can do away w/PrivateResolver vs. Resolver. LRSCell being private should
prevent non-class anonymous instantiations of the interface.
* Various unused methods and constructors
* Comment on whether or not retention of ordering is important in {{baseSlice}}
as you could do away w/interim ByteBuffer array if not. I believe it is, but
worth clarifying.
* nit: whitespace inconsistencies in {{values(...)}}
I definitely have more to review but figured I'd get the feedback I have thus
far into here. I'm considering doing subsequent reviews as a branch I push to
github w/comments inline (similar to what Benedict's been doing lately) with
feedback as putting this quantity of feedback in Jira comments is a burden for
both you and I - thoughts?
> Materialized Views (was: Global Indexes)
> ----------------------------------------
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
> Issue Type: New Feature
> Components: API, Core
> Reporter: Jonathan Ellis
> Assignee: Carl Yeksigian
> Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh
>
>
> Local indexes are suitable for low-cardinality data, where spreading the
> index across the cluster is a Good Thing. However, for high-cardinality
> data, local indexes require querying most nodes in the cluster even if only a
> handful of rows is returned.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)