qwtsc opened a new issue, #2358: URL: https://github.com/apache/incubator-hugegraph/issues/2358
### Bug Type (问题类型) rest-api (结果不合预期) ### Before submit - [X] 我已经确认现有的 [Issues](https://github.com/apache/hugegraph/issues) 与 [FAQ](https://hugegraph.apache.org/docs/guides/faq/) 中没有相同 / 重复问题 (I have confirmed and searched that there are no similar problems in the historical issue and documents) ### Environment (环境信息) hugegraph with release-1.0 branch, but I doubt this is not fixed in master branch as well (need to double-check). Mysql is used as backend storage. ### Expected & Actual behavior (期望与实际表现)  A very strange situation happens when concurrently appending property key to one vertex label. Some property key ids of 0 appears in the column of the vertex label table. After one whole day's investigation and code review, I found that this is due to weak concurrency control in VertexLabel Update Api. ### problematic code ```java @Override public VertexLabel append() { ---> // need to add lock here! ---> // we will get the same vertex label object in the memory. VertexLabel vertexLabel = this.vertexLabelOrNull(this.name); if (vertexLabel == null) { throw new NotFoundException("Can't update vertex label '%s' " + "since it doesn't exist", this.name); } this.checkStableVars(); this.checkProperties(Action.APPEND); this.checkNullableKeys(Action.APPEND); Userdata.check(this.userdata, Action.APPEND); for (String key : this.properties) { PropertyKey propertyKey = this.graph().propertyKey(key); ----> // this will add new property key id to concurrent-unsafe HashSet vertexLabel.property(propertyKey.id()); } for (String key : this.nullableKeys) { PropertyKey propertyKey = this.graph().propertyKey(key); vertexLabel.nullableKey(propertyKey.id()); } vertexLabel.userdata(this.userdata); this.graph().updateVertexLabel(vertexLabel); return vertexLabel; } ``` and then ```java private void saveSchema(SchemaElement schema, boolean update, Consumer<SchemaElement> updateCallback) { // Lock for schema update LockUtil.Locks locks = new LockUtil.Locks(this.params().name()); try { locks.lockWrites(LockUtil.hugeType2Group(schema.type()), schema.id()); if (updateCallback != null) { // NOTE: Do schema update in the lock block updateCallback.accept(schema); } // System schema just put into SystemSchemaStore in memory if (schema.longId() < 0L) { this.systemSchemaStore.add(schema); return; } ----> // this will then read the cached object with the concurrent-unsafe hash set inside it. BackendEntry entry = this.serialize(schema); this.beforeWrite(); if (update) { this.doUpdateIfPresent(entry); // TODO: also support updateIfPresent for index-update this.indexTx.updateNameIndex(schema, false); } else { // TODO: support updateIfAbsentProperty (property: label name) this.doUpdateIfAbsent(entry); this.indexTx.updateNameIndex(schema, false); } this.afterWrite(); } finally { locks.unlock(); } } ``` and then look at MysqlSerializer ```java @Override public BackendEntry writeVertexLabel(VertexLabel vertexLabel) { TableBackendEntry entry = newBackendEntry(vertexLabel); entry.column(HugeKeys.ID, vertexLabel.id().asLong()); entry.column(HugeKeys.NAME, vertexLabel.name()); entry.column(HugeKeys.ID_STRATEGY, vertexLabel.idStrategy().code()); entry.column(HugeKeys.PROPERTIES, ---> here is issue this.toLongSet(vertexLabel.properties())); entry.column(HugeKeys.PRIMARY_KEYS, ---> here is issue this.toLongList(vertexLabel.primaryKeys())); entry.column(HugeKeys.NULLABLE_KEYS, ---> here is issue this.toLongSet(vertexLabel.nullableKeys())); entry.column(HugeKeys.INDEX_LABELS, ---> here is issue this.toLongSet(vertexLabel.indexLabels())); this.writeEnableLabelIndex(vertexLabel, entry); this.writeUserdata(vertexLabel, entry); entry.column(HugeKeys.STATUS, vertexLabel.status().code()); entry.column(HugeKeys.TTL, vertexLabel.ttl()); entry.column(HugeKeys.TTL_START_TIME, vertexLabel.ttlStartTime().asLong()); return entry; } ``` and then let's look at the toLongSet/to.LongList method. `Collection<Id> ids` is actually the cached vertex label's property and may be modified by other threads concurrently, so we cannot use ids.size() and for-loop to copy the items from collection to array! This will lead to the tailed unwanted 0s in the serialized result! ```java @Override protected Object toLongList(Collection<Id> ids) { ---> // this is the bug! long[] values = new long[ids.size()]; int i = 0; ---> // this is the bug! for (Id id : ids) { values[i++] = id.asLong(); } return JsonUtil.toJson(values); } ``` My advice is to add a read lock in the vertex label api to protect this code zone. ### Vertex/Edge example (问题点 / 边数据举例) _No response_ ### Schema [VertexLabel, EdgeLabel, IndexLabel] (元数据结构) _No response_ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@hugegraph.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org