[
https://issues.apache.org/jira/browse/PHOENIX-2783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209304#comment-15209304
]
Sergey Soldatov commented on PHOENIX-2783:
------------------------------------------
Yep. These are the tests I used:
{noformat}
@Benchmark
public void listMultimapTest() {
    // Mirrors the existing PTableImpl check. Note that put() on an
    // ArrayListMultimap always returns true, so the inner scan runs
    // for every column.
    ListMultimap<String, Pair> columnsByName = ArrayListMultimap.create(NUM, 1);
    for (Pair column : l) {
        String familyName = column.getS1();
        String columnName = column.getS2();
        if (columnsByName.put(columnName, column)) {
            int count = 0;
            for (Pair dupColumn : columnsByName.get(columnName)) {
                if (Objects.equal(familyName, dupColumn.getS1())) {
                    count++;
                    if (count > 1) {
                        System.out.println("Found duplicate");
                        break;
                    }
                }
            }
        }
    }
}

@Benchmark
public void hashSetTest() {
    // Single pass: add() returns false as soon as a key repeats.
    HashSet<String> set = new HashSet<String>(NUM);
    for (Pair column : l) {
        String familyName = column.getS1();
        String columnName = column.getS2();
        if (!set.add(familyName + "." + columnName)) {
            System.out.println("Found duplicate");
            break;
        }
    }
}
{noformat}
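A JMH state class along these lines would make the two snippets above runnable if the benchmark methods are placed inside it; the Pair class and the l/NUM fields are reconstructions matching the snippets, not Phoenix types:
{noformat}
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class DuplicateCheckBenchmark {
    static final int NUM = 1000; // assumed list size

    // Minimal stand-in for the pairs used in the benchmarks.
    static class Pair {
        private final String s1, s2;
        Pair(String s1, String s2) { this.s1 = s1; this.s2 = s2; }
        String getS1() { return s1; }
        String getS2() { return s2; }
    }

    List<Pair> l;

    @Setup
    public void setup() {
        // UUID-generated values: effectively no duplicates, so neither
        // benchmark breaks out of its loop early (the worst case).
        l = new ArrayList<Pair>(NUM);
        for (int i = 0; i < NUM; i++) {
            l.add(new Pair(UUID.randomUUID().toString(),
                           UUID.randomUUID().toString()));
        }
    }
}
{noformat}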
Values for the pairs were UUID-generated, so both loops ran over all values
without ever breaking on a duplicate (the worst case). I agree that using a
HashSet looks cleaner, and I have no objection to doing it that way. By the
way, should I refactor the check in PTableImpl in the same way?
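If we go that way, a minimal sketch of the pre-flight check in MetaDataClient could look like the following; the helper name, the ColumnDef accessors, and the exception choice are assumptions for illustration, not the actual patch:
{noformat}
import java.sql.SQLException;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.apache.phoenix.parse.ColumnDef;

// Hypothetical helper: fail fast before the server-side createTable
// call, so the catalog is never mutated for a bad statement.
private static void throwIfDuplicateColumns(List<ColumnDef> columnDefs)
        throws SQLException {
    Set<String> seen = new HashSet<String>(columnDefs.size());
    for (ColumnDef def : columnDefs) {
        // Accessor names here are assumed for the sketch.
        String familyName = def.getColumnDefName().getFamilyName();
        String columnName = def.getColumnDefName().getColumnName();
        if (!seen.add(familyName + "." + columnName)) {
            throw new SQLException("Duplicate column: "
                + familyName + "." + columnName);
        }
    }
}
{noformat}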
> Creating secondary index with duplicated columns makes the catalog corrupted
> ----------------------------------------------------------------------------
>
> Key: PHOENIX-2783
> URL: https://issues.apache.org/jira/browse/PHOENIX-2783
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.7.0
> Reporter: Sergey Soldatov
> Assignee: Sergey Soldatov
> Attachments: PHOENIX-2783-1.patch, PHOENIX-2783-2.patch
>
>
> Simple example
> {noformat}
> create table x (t1 varchar primary key, t2 varchar, t3 varchar);
> create index idx on x (t2) include (t1,t3,t3);
> {noformat}
> causes an exception reporting that a duplicated column was detected, but the
> client updates the catalog before throwing it, leaving the catalog unusable.
> Every subsequent attempt to use table x throws an
> ArrayIndexOutOfBoundsException. This problem was discussed on the user list
> recently.
> The cause of the problem is that the check for duplicated columns happens in
> PTableImpl after MetaDataClient completes the server-side createTable.
> The simple fix is to add a similar check in MetaDataClient before
> createTable is called.
> Perhaps someone can suggest a more elegant way to fix it?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)