[ https://issues.apache.org/jira/browse/CASSANDRA-10358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14987144#comment-14987144 ]
Sylvain Lebresne commented on CASSANDRA-10358:
----------------------------------------------
Now that I read this more carefully, I'm actually not all that sure I
understand what you're trying to do here.
You're trying to create sstables with no overlap in token ranges, so that does
mean you're using a _sorted_ {{CQLSSTableWriter}}, right? Otherwise, how could
we ever ensure no token overlap when we have no way to know in which order
partitions will be passed to the writer (we'd have to buffer everything in
memory)? And if you use a sorted writer, then you shouldn't care about
CASSANDRA-7360 since it only affects unsorted writers.
So I was actually too quick in calling your point #2 above a bug; it's not. I
don't think there is a practical way for an unsorted writer to generate
sstables with non-overlapping token ranges (and CASSANDRA-7360 is only a very
minor part of the problem).
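To illustrate the point above, here is a minimal sketch. It is a purely hypothetical simulation, not Cassandra code: tokens are plain longs, an "sstable" is reduced to its {min, max} token pair, and the flush trigger is a partition count rather than a buffer size in MB. The names {{FlushOverlapDemo}}, {{write}} and {{anyOverlap}} are invented for the example. It buffers incoming partitions and flushes whenever the buffer is full, which is roughly what a buffering unsorted writer has to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical simulation only: no Cassandra classes are involved.
final class FlushOverlapDemo {

    // Buffers tokens in sorted order and "flushes" one sstable, recorded as
    // {minToken, maxToken}, each time the buffer holds bufferSize partitions.
    static List<long[]> write(long[] tokens, int bufferSize) {
        List<long[]> sstables = new ArrayList<>();
        TreeSet<Long> buffer = new TreeSet<>();
        for (long token : tokens) {
            buffer.add(token);
            if (buffer.size() >= bufferSize) {
                sstables.add(new long[] {buffer.first(), buffer.last()});
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {
            // Flush whatever is left at the end of the input.
            sstables.add(new long[] {buffer.first(), buffer.last()});
        }
        return sstables;
    }

    // True if any two flushed sstables cover intersecting token ranges.
    static boolean anyOverlap(List<long[]> sstables) {
        for (int i = 0; i < sstables.size(); i++) {
            for (int j = i + 1; j < sstables.size(); j++) {
                long[] a = sstables.get(i);
                long[] b = sstables.get(j);
                if (a[0] <= b[1] && b[0] <= a[1]) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] unsorted = {50, 10, 90, 20, 70, 30};
        long[] sorted = {10, 20, 30, 50, 70, 90};
        // Arbitrary order: flushes [10,90] then [20,70], which overlap.
        System.out.println(anyOverlap(write(unsorted, 3)));
        // Token order: flushes [10,30] then [50,90], which do not.
        System.out.println(anyOverlap(write(sorted, 3)));
    }
}
```

With these inputs the first call reports an overlap and the second does not. Avoiding the overlap for arbitrary input order would require holding every partition in the buffer until the end, which is the "buffer everything in memory" problem.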
The intent behind {{CQLSSTableWriter}} is that the generated sstables should be
loaded through {{sstableloader}}, which implies both that overlapping sstables
are not a problem (even if you use LCS, the sstables will start at level 0,
which can have overlaps) and that you can generate sstables in parallel without
needing to tweak the filename: the sstables will be renamed so they don't
conflict once loaded into the node.
Overall, it seems what you're trying to achieve is not something
{{CQLSSTableWriter}} was designed for. We're happy to make that design evolve
if that's sensible, but I think we'd need more clarity on what your exact use
case is (and, in particular, why using {{sstableloader}} is not good enough).
> Allow CQLSSTableWriter.Builder to use custom AbstractSSTableSimpleWriter
> -------------------------------------------------------------------------
>
> Key: CASSANDRA-10358
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10358
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Andre Turgeon
> Priority: Minor
> Attachments: SSTableWriterCreationStrategy.patch, patch.txt
>
>
> I've created a patch for your consideration.
> This change to CQLSSTableWriter allows for a custom
> AbstractSSTableSimpleWriter to be specified.
> I needed this for a bulkload process I wrote. I believe the change would be
> beneficial for other people as well.
> Below are the reasons I needed a custom implementation of
> AbstractSSTableSimpleWriter:
> 1) The available implementations of AbstractSSTableSimpleWriter do not
> provide a way to specify the filename (or rather revision) of the sstable. I
> needed to control the name because my bulkload process writes sstables in
> parallel (on multiple machines) and I wish to avoid name collisions.
> 2) I discovered a problem with SSTableSimpleUnsortedWriter where it creates
> invalid level-compaction-style sstables: it allows a partition to span 2
> sstables, which violates the "no overlap of token ranges" constraint of level
> compaction.