[ 
https://issues.apache.org/jira/browse/KAFKA-13251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chaos updated KAFKA-13251:
--------------------------
    Description: 
A disk error occurred on broker 42, and the ISR was then shrunk to broker 42 itself.
 Why was the ISR shrunk to the broker that has the disk error?

That is, why does the log show "Shrinking ISR from 55,42 to 42" rather than 
"Shrinking ISR from 55,42 to 55"?

Note:

Another partition (110) shrank correctly.

 

kafka logs:
 broker42:

[2021-08-26 20:20:55,640] ERROR [ReplicaManager broker=42] Error processing 
fetch with max size 1048576 from consumer on partition topic_xx-123: 
(fetchOffset=11061228956, logStartOffset=-1, maxBytes=1048576, 
currentLeaderEpoch=Optional.empty) (kafka.server.ReplicaManager)
 org.apache.kafka.common.errors.CorruptRecordException: Found record size 0 
smaller than minimum record overhead (14) in file 
/data4/kafka-logs/topic_xx-123/00000000011060934646.log.
 [2021-08-26 20:20:55,640] ERROR Error while appending records to topic_xx-123 
in dir /data4/kafka-logs (kafka.server.LogDirFailureChannel)
 [2021-08-26 20:20:55,645] ERROR Error while deleting segments for topic_xx-123 
in dir /data4/kafka-logs (kafka.server.LogDirFailureChannel)
 java.nio.file.FileSystemException: 
/data4/kafka-logs/topic_xx-123/00000000011040402299.log -> 
/data4/kafka-logs/topic_xx-123/00000000011040402299.log.deleted: Read-only file 
system
 Suppressed: java.nio.file.FileSystemException: 
/data4/kafka-logs/topic_xx-123/00000000011040402299.log -> 
/data4/kafka-logs/topic_xx-123/00000000011040402299.log.deleted: Read-only file 
system
 [2021-08-26 20:20:55,644] ERROR Error while appending records to topic_xx-123 
in dir /data4/kafka-logs (kafka.server.LogDirFailureChannel)
 [2021-08-26 20:20:55,652] INFO [Partition topic_xx-123 broker=42] Shrinking 
ISR from 55,42 to 42. Leader: (highWatermark: 11061228956, endOffset: 
11061228965). Out of sync replicas: (brokerId: 55, endOffset: 11061228956). 
(kafka.cluster.Partition)
  

broker55:
 [2021-08-26 20:20:32,456] WARN [ReplicaFetcher replicaId=55, leaderId=42, 
fetcherId=0] Error in response for fetch request (type=FetchRequest, 
replicaId=55, maxWait=500, minBytes=1, maxBytes=10485760, fetchData={}, 
isolationLevel=READ_UNCOMMITTED, toForget=, metadata=(sessionId=830774713, 
epoch=1562806014), rackId=) (kafka.server.ReplicaFetcherThread)

[2021-08-26 20:20:43,503] INFO [Partition topic_xxx-110 broker=55] Shrinking 
ISR from 55,42 to 55. Leader: (highWatermark: 11061384367, endOffset: 
11061388788). Out of sync replicas: (brokerId: 42, endOffset: 11061384367). 
(kafka.cluster.Partition)

 

disk error on broker42 is:
 Aug 26 20:20:55 kernel: sd 0:2:5:0: [sdf] tag#33 FAILED Result: 
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
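
To compare the two shrink events quoted above side by side, here is a minimal sketch that parses the "Shrinking ISR" lines. The regular expression is an assumption based only on the two lines in this report (other Kafka versions may format the message differently), and the names parse_shrink and offsets_above_hw are hypothetical helpers, not part of Kafka.

```python
import re

# Matches the "Shrinking ISR" INFO lines in the shape quoted in this report.
SHRINK_RE = re.compile(
    r"\[Partition (?P<partition>\S+) broker=(?P<broker>\d+)\] "
    r"Shrinking ISR from (?P<old>[\d,]+) to (?P<new>[\d,]+)\. "
    r"Leader: \(highWatermark: (?P<hw>\d+), endOffset: (?P<leo>\d+)\)"
)


def parse_shrink(line):
    """Return a dict describing one ISR shrink event, or None if no match."""
    # The mail client wrapped the log lines, so collapse all whitespace first.
    flat = " ".join(line.split())
    m = SHRINK_RE.search(flat)
    if m is None:
        return None
    old = [int(b) for b in m.group("old").split(",")]
    new = [int(b) for b in m.group("new").split(",")]
    return {
        "partition": m.group("partition"),
        "logged_by": int(m.group("broker")),
        "removed": [b for b in old if b not in new],
        "remaining": new,
        # Offsets the logging broker holds beyond the high watermark.
        "offsets_above_hw": int(m.group("leo")) - int(m.group("hw")),
    }


# The two lines from the report, with the original mail wrapping kept.
LINES = [
    "[2021-08-26 20:20:55,652] INFO [Partition topic_xx-123 broker=42] Shrinking \n"
    "ISR from 55,42 to 42. Leader: (highWatermark: 11061228956, endOffset: \n"
    "11061228965). Out of sync replicas: (brokerId: 55, endOffset: 11061228956). \n"
    "(kafka.cluster.Partition)",
    "[2021-08-26 20:20:43,503] INFO [Partition topic_xxx-110 broker=55] Shrinking \n"
    "ISR from 55,42 to 55. Leader: (highWatermark: 11061384367, endOffset: \n"
    "11061388788). Out of sync replicas: (brokerId: 42, endOffset: 11061384367). \n"
    "(kafka.cluster.Partition)",
]

EVENTS = [parse_shrink(line) for line in LINES]

if __name__ == "__main__":
    for event in EVENTS:
        print(event)
```

In both events the broker that logs the shrink is the one that remains in the ISR, and the removed replica is the other broker: 42 keeps itself for topic_xx-123, and 55 keeps itself for topic_xxx-110.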


> 2.4.0
> -----
>
>                 Key: KAFKA-13251
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13251
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.4.0
>         Environment: linux 4.1.0
>            Reporter: chaos
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
