qwtsc opened a new issue, #2358:
URL: https://github.com/apache/incubator-hugegraph/issues/2358

   ### Bug Type (问题类型)
   
   rest-api (结果不合预期)
   
   ### Before submit
   
   - [X] 我已经确认现有的 [Issues](https://github.com/apache/hugegraph/issues) 与 
[FAQ](https://hugegraph.apache.org/docs/guides/faq/) 中没有相同 / 重复问题 (I have 
confirmed and searched that there are no similar problems in the historical 
issue and documents)
   
   ### Environment (环境信息)
   
   hugegraph with release-1.0 branch, but I doubt this is not fixed in master 
branch as well (need to double-check). Mysql is used as backend storage.
   
   ### Expected & Actual behavior (期望与实际表现)
   
   
![image](https://github.com/apache/incubator-hugegraph/assets/34210641/b7141855-c1bb-426e-8cdc-40e286812df6)
   A very strange situation happens when concurrently appending property key to 
one vertex label. Some property key ids of 0 appears in the column of the 
vertex label table. After one whole day's investigation and code review, I 
found that this is due to weak concurrency control in VertexLabel Update Api.
   
   ### problematic code
   ```java
       @Override
       public VertexLabel append() {
      ---> // need to add lock here!
      ---> // we will get the same vertex label object in the memory.
           VertexLabel vertexLabel = this.vertexLabelOrNull(this.name);
           if (vertexLabel == null) {
               throw new NotFoundException("Can't update vertex label '%s' " +
                                           "since it doesn't exist", this.name);
           }
   
           this.checkStableVars();
           this.checkProperties(Action.APPEND);
           this.checkNullableKeys(Action.APPEND);
           Userdata.check(this.userdata, Action.APPEND);
   
           for (String key : this.properties) {
               PropertyKey propertyKey = this.graph().propertyKey(key);
         ---->  // this will add new property key id to concurrent-unsafe 
HashSet
               vertexLabel.property(propertyKey.id());
           }
           for (String key : this.nullableKeys) {
               PropertyKey propertyKey = this.graph().propertyKey(key);
               vertexLabel.nullableKey(propertyKey.id());
           }
           vertexLabel.userdata(this.userdata);
           this.graph().updateVertexLabel(vertexLabel);
           return vertexLabel;
       }
   ```
   and then
   ```java
   private void saveSchema(SchemaElement schema, boolean update,
                               Consumer<SchemaElement> updateCallback) {
           // Lock for schema update
           LockUtil.Locks locks = new LockUtil.Locks(this.params().name());
           try {
               locks.lockWrites(LockUtil.hugeType2Group(schema.type()), 
schema.id());
   
               if (updateCallback != null) {
                   // NOTE: Do schema update in the lock block
                   updateCallback.accept(schema);
               }
   
               // System schema just put into SystemSchemaStore in memory
               if (schema.longId() < 0L) {
                   this.systemSchemaStore.add(schema);
                   return;
               }
       ----> // this will then read the cached object with the 
concurrent-unsafe hash set inside it.
               BackendEntry entry = this.serialize(schema);
   
               this.beforeWrite();
   
               if (update) {
                   this.doUpdateIfPresent(entry);
                   // TODO: also support updateIfPresent for index-update
                   this.indexTx.updateNameIndex(schema, false);
               } else {
                   // TODO: support updateIfAbsentProperty (property: label 
name)
                   this.doUpdateIfAbsent(entry);
                   this.indexTx.updateNameIndex(schema, false);
               }
   
               this.afterWrite();
           } finally {
               locks.unlock();
           }
       }
   
   ```
   and then look at MysqlSerializer
   ```java
       @Override
       public BackendEntry writeVertexLabel(VertexLabel vertexLabel) {
           TableBackendEntry entry = newBackendEntry(vertexLabel);
           entry.column(HugeKeys.ID, vertexLabel.id().asLong());
           entry.column(HugeKeys.NAME, vertexLabel.name());
           entry.column(HugeKeys.ID_STRATEGY, vertexLabel.idStrategy().code());
           entry.column(HugeKeys.PROPERTIES,
          ---> here is issue     this.toLongSet(vertexLabel.properties()));
           entry.column(HugeKeys.PRIMARY_KEYS,
           ---> here is issue     this.toLongList(vertexLabel.primaryKeys()));
           entry.column(HugeKeys.NULLABLE_KEYS,
           ---> here is issue             
this.toLongSet(vertexLabel.nullableKeys()));
           entry.column(HugeKeys.INDEX_LABELS,
            ---> here is issue            
this.toLongSet(vertexLabel.indexLabels()));
           this.writeEnableLabelIndex(vertexLabel, entry);
           this.writeUserdata(vertexLabel, entry);
           entry.column(HugeKeys.STATUS, vertexLabel.status().code());
           entry.column(HugeKeys.TTL, vertexLabel.ttl());
           entry.column(HugeKeys.TTL_START_TIME,
                        vertexLabel.ttlStartTime().asLong());
           return entry;
       }
   
   ```
   and then let's look at the toLongSet/to.LongList method. `Collection<Id> 
ids` is actually the cached vertex label's property and may be modified by 
other threads concurrently, so we cannot use ids.size() and for-loop to copy 
the items from collection to array! This will lead to the tailed unwanted 0s in 
the serialized result! 
   ```java
       @Override
       protected Object toLongList(Collection<Id> ids) {
      ---> // this is the bug! 
           long[] values = new long[ids.size()];
           int i = 0;
      ---> // this is the bug!
           for (Id id : ids) {
               values[i++] = id.asLong();
           }
           return JsonUtil.toJson(values);
       }
   ```
   My advice is to add a read lock in the vertex label api to protect this code 
zone.
   
   ### Vertex/Edge example (问题点 / 边数据举例)
   
   _No response_
   
   ### Schema [VertexLabel, EdgeLabel, IndexLabel] (元数据结构)
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@hugegraph.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to