[ https://issues.apache.org/jira/browse/CASSANDRA-16772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Scott Carey updated CASSANDRA-16772:
------------------------------------
Description:
User-defined nodetool cleanup uses a HashMap instead of a Multimap to group the
user-provided SSTables by table. Each put for a table that is already present
replaces the earlier entry, so only one file per source table is kept.
The unit test for this component did not catch this, so it is not sufficient.
As part of https://issues.apache.org/jira/browse/CASSANDRA-16767 I introduced
a helper method on Descriptor:
{code:java}
public static Multimap<ColumnFamilyStore, Descriptor> fromFilenamesGrouped(Collection<String> filenames)
{code}
That should be used instead of the custom logic in
CompactionManager.forceUserDefinedCleanup.
Broken existing code:
{code:java}
HashMap<ColumnFamilyStore, Descriptor> descriptors = Maps.newHashMap();
for (String filename : filenames)
{
    // extract keyspace and columnfamily name from filename
    Descriptor desc = Descriptor.fromFilename(filename.trim());
    if (Schema.instance.getCFMetaData(desc) == null)
    {
        logger.warn("Schema does not exist for file {}. Skipping.", filename);
        continue;
    }
    // group by keyspace/columnfamily
    ColumnFamilyStore cfs = Keyspace.open(desc.ksname).getColumnFamilyStore(desc.cfname);
    desc = cfs.getDirectories().find(new File(filename.trim()).getName());
    if (desc != null)
        descriptors.put(cfs, desc);
}
{code}
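The overwrite can be reproduced without Cassandra. The following is a minimal
sketch, not the project's code: plain strings stand in for ColumnFamilyStore
and Descriptor, the class name GroupingDemo and the file names are made up for
illustration, and a JDK map of lists stands in for Guava's Multimap. It also
suggests why a test that passes only one SSTable per table cannot tell the two
approaches apart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: strings stand in for ColumnFamilyStore (key) and Descriptor (value).
class GroupingDemo
{
    // Same shape as the broken code: one value per key, so a later put replaces an earlier one.
    static Map<String, String> groupWithHashMap(List<String[]> tableAndFile)
    {
        Map<String, String> byTable = new HashMap<>();
        for (String[] pair : tableAndFile)
            byTable.put(pair[0], pair[1]);
        return byTable;
    }

    // Multimap-style grouping with JDK types: every file is kept under its table.
    static Map<String, List<String>> groupAll(List<String[]> tableAndFile)
    {
        Map<String, List<String>> byTable = new HashMap<>();
        for (String[] pair : tableAndFile)
            byTable.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        return byTable;
    }

    public static void main(String[] args)
    {
        // Hypothetical input: two SSTables for ks1.t1, one for ks1.t2.
        List<String[]> input = Arrays.asList(
            new String[]{ "ks1.t1", "nb-1-big-Data.db" },
            new String[]{ "ks1.t1", "nb-2-big-Data.db" },
            new String[]{ "ks1.t2", "nb-7-big-Data.db" });

        // Only the last file put for ks1.t1 survives in the HashMap variant.
        System.out.println(groupWithHashMap(input).get("ks1.t1"));
        // Both files for ks1.t1 are kept in the grouped variant.
        System.out.println(groupAll(input).get("ks1.t1"));
    }
}
```

With two files for ks1.t1, the HashMap variant retains a single entry for that
table while the grouped variant retains both. With one file per table, both
variants return the same contents, which is presumably how the gap went
unnoticed.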
Contents of helper method introduced in other ticket:
{code:java}
public static Multimap<ColumnFamilyStore, Descriptor> fromFilenamesGrouped(Collection<String> filenames)
{
    Multimap<ColumnFamilyStore, Descriptor> descriptors = ArrayListMultimap.create();
    for (String filename : filenames)
    {
        // extract keyspace and columnfamily name from filename
        Descriptor desc = Descriptor.fromFilename(filename.trim());
        if (Schema.instance.getCFMetaData(desc) == null)
        {
            logger.warn("Schema does not exist for file {}. Skipping.", filename);
            continue;
        }
        // group by keyspace/columnfamily
        ColumnFamilyStore cfs = Keyspace.open(desc.ksname).getColumnFamilyStore(desc.cfname);
        desc = cfs.getDirectories().find(new File(filename.trim()).getName());
        if (desc != null)
            descriptors.put(cfs, desc);
    }
    return descriptors;
}
{code}
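Once the files are grouped, the caller has to visit every SSTable under each
table rather than a single value per key. The sketch below shows that loop
shape under stated assumptions: GroupedCleanupSketch and cleanupAll are
hypothetical names, strings stand in for ColumnFamilyStore and Descriptor, and
a JDK Map<String, List<String>> stands in for the Multimap (Guava's
Multimap.asMap() exposes a comparable per-key view). It is not the actual body
of CompactionManager.forceUserDefinedCleanup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types; the real code works with ColumnFamilyStore and Descriptor.
class GroupedCleanupSketch
{
    // Visits every SSTable of every table and records each visit as "table:file".
    static List<String> cleanupAll(Map<String, List<String>> sstablesByTable)
    {
        List<String> cleaned = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : sstablesByTable.entrySet())
        {
            String table = entry.getKey();
            for (String sstable : entry.getValue())
            {
                // A real implementation would run cleanup on this SSTable here.
                cleaned.add(table + ":" + sstable);
            }
        }
        return cleaned;
    }

    public static void main(String[] args)
    {
        // Hypothetical grouped input: two SSTables for ks1.t1, one for ks1.t2.
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        grouped.put("ks1.t1", Arrays.asList("nb-1-big-Data.db", "nb-2-big-Data.db"));
        grouped.put("ks1.t2", Arrays.asList("nb-7-big-Data.db"));

        // Every input file appears once in the result, in insertion order.
        System.out.println(cleanupAll(grouped));
    }
}
```

A regression test along these lines could assert that the number of cleaned
SSTables equals the number of input files when a table has more than one.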
was:
User defined nodetool cleanup uses a HashMap instead of a MultiMap to group the
user provided SSTables by table. This means it only keeps one file per source
table.
It also means the unit test for this component is not sufficient.
As part of https://issues.apache.org/jira/browse/CASSANDRA-16767 I introduced
a helper method on Descriptor:
{code:java}
public static Multimap<ColumnFamilyStore, Descriptor> fromFilenamesGrouped(Collection<String> filenames)
{code}
That should be used instead of the custom logic in
CompactionManager.forceUserDefinedCleanup
> User Defined nodetool cleanup only processes one SSTable per table
> ------------------------------------------------------------------
>
> Key: CASSANDRA-16772
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16772
> Project: Cassandra
> Issue Type: Bug
> Reporter: Scott Carey
> Assignee: Scott Carey
> Priority: Normal