[ https://issues.apache.org/jira/browse/SAMZA-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weiqing Yang updated SAMZA-2363:
--------------------------------
    Description: 
I tried to test the implementation ([https://github.com/apache/samza/pull/1197]) of Samza on K8s on AKS, using the latest Samza (master branch), and got the error below. The job was trying to store its logs and state on Azure Files. (P.S. Samza 1.1 does not hit this issue, and the job built on Samza 1.1 works fine on AKS.)

{code:java}
2019-11-04 00:56:00.558 [main] SamzaContainer [INFO] Entering run loop.
2019-11-04 00:56:00.559 [main] ClusterBasedProcessorLifecycleListener [INFO] Container Started
2019-11-04 00:56:57.106 [kafka-producer-network-thread | kafka_producer-wikipedia_application-1] Metadata [INFO] Cluster ID: YNWIzBLFSa2Lg5wehzo7ZA
2019-11-04 00:57:03.920 [main] KafkaSystemAdmin [INFO] Fetching SSP metadata for: [SystemStreamPartition [kafka, wikipedia-stats-changelog, 7]]
2019-11-04 00:57:03.923 [main] KafkaSystemAdmin [INFO] Fetching SSP metadata for: [SystemStreamPartition [kafka, wikipedia-application-1-window-statsWindow, 7]]
2019-11-04 00:57:04.222 [main] RunLoop [ERROR] Task Partition 7 commit failed
org.rocksdb.RocksDBException: while link file to /tmp/log/wikipedia-application-1/wikipedia-stats/Partition_7-1572829023927.tmp/000013.sst: /tmp/log/wikipedia-application-1/wikipedia-stats/Partition_7/000013.sst: Operation not supported
	at org.rocksdb.Checkpoint.createCheckpoint(Native Method)
	at org.rocksdb.Checkpoint.createCheckpoint(Checkpoint.java:51)
	at org.apache.samza.storage.kv.RocksDbKeyValueStore.checkpoint(RocksDbKeyValueStore.scala:245)
	at org.apache.samza.storage.kv.LoggedStore.checkpoint(LoggedStore.scala:125)
	at org.apache.samza.storage.kv.SerializedKeyValueStore.checkpoint(SerializedKeyValueStore.scala:170)
	at org.apache.samza.storage.kv.CachedStore.checkpoint(CachedStore.scala:297)
	at org.apache.samza.storage.kv.NullSafeKeyValueStore.checkpoint(NullSafeKeyValueStore.scala:103)
	at org.apache.samza.storage.kv.KeyValueStorageEngine$$anonfun$checkpoint$1.apply(KeyValueStorageEngine.scala:215)
	at org.apache.samza.storage.kv.KeyValueStorageEngine$$anonfun$checkpoint$1.apply(KeyValueStorageEngine.scala:212)
	at org.apache.samza.util.TimerUtil$class.updateTimer(TimerUtil.scala:37)
	at org.apache.samza.storage.kv.KeyValueStorageEngine.updateTimer(KeyValueStorageEngine.scala:37)
	at org.apache.samza.storage.kv.KeyValueStorageEngine.checkpoint(KeyValueStorageEngine.scala:212)
	at org.apache.samza.storage.TransactionalStateTaskStorageManager$$anonfun$2.apply(TransactionalStateTaskStorageManager.scala:67)
	at org.apache.samza.storage.TransactionalStateTaskStorageManager$$anonfun$2.apply(TransactionalStateTaskStorageManager.scala:66)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
	at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
	at org.apache.samza.storage.TransactionalStateTaskStorageManager.checkpoint(TransactionalStateTaskStorageManager.scala:66)
	at org.apache.samza.container.TaskInstance.commit(TaskInstance.scala:267)
	at org.apache.samza.container.RunLoop$AsyncTaskWorker$5.run(RunLoop.java:547)
	at org.apache.samza.container.RunLoop$AsyncTaskWorker.commit(RunLoop.java:566)
	at org.apache.samza.container.RunLoop$AsyncTaskWorker.run(RunLoop.java:432)
	at org.apache.samza.container.RunLoop$AsyncTaskWorker.access$300(RunLoop.java:357)
	at org.apache.samza.container.RunLoop.runTasks(RunLoop.java:244)
	at org.apache.samza.container.RunLoop.run(RunLoop.java:176)
	at org.apache.samza.container.SamzaContainer.run(SamzaContainer.scala:768)
	at org.apache.samza.runtime.ContainerLaunchUtil.run(ContainerLaunchUtil.java:139)
	at org.apache.samza.runtime.ContainerLaunchUtil.run(ContainerLaunchUtil.java:82)
	at org.apache.samza.runtime.LocalContainerRunner.main(LocalContainerRunner.java:80)
2019-11-04 00:57:04.223 [main] RunLoop [ERROR] Caught throwable and stopping run loop
org.rocksdb.RocksDBException: while link file to /tmp/log/wikipedia-application-1/wikipedia-stats/Partition_7-1572829023927.tmp/000013.sst: /tmp/log/wikipedia-application-1/wikipedia-stats/Partition_7/000013.sst: Operation not supported
	at org.rocksdb.Checkpoint.createCheckpoint(Native Method)
{code}

  was:
I tried to test the implementation ([https://github.com/apache/samza/pull/1197]) of Samza on K8s on AKS, using the latest Samza (master branch), and got the error below. The job was trying to store its logs and state on Azure Files. (P.S. The implementation works fine with Samza 1.1.)

> RocksDB failed to create hard link on Azure file
> ------------------------------------------------
>
>                 Key: SAMZA-2363
>                 URL: https://issues.apache.org/jira/browse/SAMZA-2363
>             Project: Samza
>          Issue Type: Bug
>          Components: kv, kv-store
>            Reporter: Weiqing Yang
>            Priority: Major

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
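
The exception above is raised while RocksDB links an existing .sst file into the checkpoint directory, which suggests the mounted volume rejects hard links. A minimal sketch for checking that independently of Samza is shown below. It assumes nothing beyond the JDK; the class name HardLinkProbe and the method supportsHardLinks are hypothetical and are not part of Samza or RocksDB. Any filesystem-level failure of the link call is treated as "hard links not usable", which is a simplification.

```java
import java.io.IOException;
import java.nio.file.FileSystemException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Samza or RocksDB.
final class HardLinkProbe {

    // Creates a scratch file in dir and tries to hard-link it, mirroring
    // what a same-filesystem RocksDB checkpoint does with .sst files.
    static boolean supportsHardLinks(Path dir) throws IOException {
        Path target = Files.createTempFile(dir, "probe-", ".sst");
        Path link = target.resolveSibling(target.getFileName() + ".link");
        try {
            Files.createLink(link, target);
            return true;
        } catch (UnsupportedOperationException | FileSystemException e) {
            // Simplification: every filesystem-level failure of the link
            // call (e.g. "Operation not supported") counts as unsupported.
            return false;
        } finally {
            // Always remove the scratch files, whether or not linking worked.
            Files.deleteIfExists(link);
            Files.deleteIfExists(target);
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the store directory as the first argument; defaults to the
        // JVM temp directory when none is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.println(dir + " supports hard links: " + supportsHardLinks(dir));
    }
}
```

Running it inside the container against the store directory from the log (for example /tmp/log/wikipedia-application-1/wikipedia-stats) should show whether the failure comes from the volume itself rather than from Samza's checkpoint path.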