Yeah, you've basically got it right. It's a bug. Please open a JIRA (and perhaps take a stab at a patch). It's low on my priority list, as we mostly just do updates or delete whole rows.
-clint

On Tue, Jul 21, 2009 at 1:04 PM, Andrew McCall <[email protected]> wrote:
> Hi,
>
> I've been using the IndexedTable stuff from contrib and come across a bit
> of an issue.
>
> When I delete a column my indexes are removed for that column. I've run
> through the code in IndexedRegion and used very similar code in my own
> classes to recreate the index after I've run the delete.
>
> I've also noticed that if I run a Put after the Delete then the index will
> be re-created.
>
> Neither the Delete nor the subsequent Put in the second example uses any of
> the columns that are part of the index (either indexed or additional
> columns).
>
> If I'm not mistaken the problem lies in the code to rebuild the index from
> org.apache.hadoop.hbase.regionserver.tableindexed.IndexedRegion:
>
>   @Override
>   public void delete(Delete delete, final Integer lockid, boolean writeToWAL)
>       throws IOException {
>
>     if (!getIndexes().isEmpty()) {
>       // Need all columns
>       NavigableSet<byte[]> neededColumns = getColumnsForIndexes(getIndexes());
>
>       Get get = new Get(delete.getRow());
>       for (byte [] col : neededColumns) {
>         get.addColumn(col);
>       }
>
>       Result oldRow = super.get(get, null);
>       SortedMap<byte[], byte[]> oldColumnValues = convertToValueMap(oldRow);
>
>       for (IndexSpecification indexSpec : getIndexes()) {
>         removeOldIndexEntry(indexSpec, delete.getRow(), oldColumnValues);
>       }
>
>       // Handle if there is still a version visible.
>       if (delete.getTimeStamp() != HConstants.LATEST_TIMESTAMP) {
>         get.setTimeRange(1, delete.getTimeStamp());
>         oldRow = super.get(get, null);
>         SortedMap<byte[], byte[]> currentColumnValues = convertToValueMap(oldRow);
>         LOG.debug("There are " + currentColumnValues + " entries to re-index");
>
>         for (IndexSpecification indexSpec : getIndexes()) {
>           if (IndexMaintenanceUtils.doesApplyToIndex(indexSpec, currentColumnValues)) {
>             updateIndex(indexSpec, delete.getRow(), currentColumnValues);
>           }
>         }
>       }
>     }
>     super.delete(delete, lockid, writeToWAL);
>   }
>
> I'm not sure if I've got this right but it seems that any delete will
> remove the indexes, but they will only be rebuilt if the delete is of a
> previous version for the row, and the index will then be built using
> data from the version prior to that which you've just deleted - which seems
> to mean it would, more often than not, be out of date.
>
> More broadly it also occurs to me that it may make sense not to delete the
> indexes at all unless the Delete would otherwise affect them. In my case
> there isn't really any reason to remove the indexes, the column I'm deleting
> is completely unrelated.
>
> Cheers,
> Andrew
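
For anyone picking up the JIRA: the last suggestion in the quoted message (leave the index alone unless the Delete touches a column the index depends on) might look roughly like the check below. This is only a standalone sketch, not code from IndexedRegion. The class IndexDeleteCheck, its two methods, and the column names are all hypothetical, an empty column set is assumed to stand for a whole-row delete, and how the touched columns are read off a real Delete is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of HBase: decides whether a delete needs index maintenance.
final class IndexDeleteCheck {

    // byte[] has no value-based equals/hashCode, so the set is ordered by array contents.
    static TreeSet<byte[]> columnSet(String... names) {
        TreeSet<byte[]> set = new TreeSet<byte[]>((a, b) -> Arrays.compare(a, b));
        for (String name : names) {
            set.add(name.getBytes(StandardCharsets.UTF_8));
        }
        return set;
    }

    // True if the delete could change an index entry.
    // Assumption: an empty deletedColumns set models a delete of the whole row.
    // indexColumns must be content-ordered (see columnSet) for contains() to work on byte[].
    static boolean deleteAffectsIndex(Set<byte[]> deletedColumns, TreeSet<byte[]> indexColumns) {
        if (deletedColumns.isEmpty()) {
            return true; // whole-row delete always invalidates the index entry
        }
        for (byte[] column : deletedColumns) {
            if (indexColumns.contains(column)) {
                return true; // an indexed or additional column is being removed
            }
        }
        return false; // unrelated columns only: index can stay as it is
    }

    public static void main(String[] args) {
        // Made-up column names, covering both indexed and additional columns.
        TreeSet<byte[]> indexColumns = columnSet("info:name", "info:email");

        System.out.println(deleteAffectsIndex(columnSet("stats:visits"), indexColumns));
        System.out.println(deleteAffectsIndex(columnSet("info:email"), indexColumns));
        System.out.println(deleteAffectsIndex(columnSet(), indexColumns));
    }
}
```

With a guard like this around the removeOldIndexEntry loop, the case described above (deleting a column unrelated to the index) would skip index maintenance entirely. It does not address the separate problem of which version the index is rebuilt from.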
