RaulGracia commented on issue #3466: URL: https://github.com/apache/bookkeeper/issues/3466#issuecomment-1235540195
Thanks @eolivelli, reposting my answer from Slack for better visibility: https://pravega-io.slack.com/archives/C0151LSM46L/p1662115749173509

> is the ledger znode present on ZK ?

The cluster is no longer in that state, but if the BookKeeper client gets a `BKException$BKNoSuchLedgerExistsException`, my understanding is that the znode should not be there. However, I see that ledger `3246` is created at `2022-07-20 14:45:25,706`, and the Bookie logs then show some queries on the ledger metadata znodes:

```
2-bookie-9c88965c-2022-07-20T14-55-24Z.log.gz:2022-07-20T14:51:48.716729331+00:00 stdout F 14:51:48,716 DEBUG Reading reply session id: 0x3016baeab7f000f, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000 finished:false header:: 36287,8 replyHeader:: 36287,38654709412,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000,F response:: v{'L3195,'L3070,'L3071,'L3192,'L2784,'L3078,'L3199,'L3079,'L3076,'L3197,'L3077,'L3198,'L3081,'L0377,'L3089,'L3087,'L0372,'L3052,'L3173,'L3053,'L3295,'L3050,'L3171,'L3172,'L3293,'L3290,'L3170,'L3291,'L0463,'L3058,'L0464,'L3059,'L3056,'L3177,'L3057,'L3178,'L0461,'L3054,'L3175,'L3055,'L3176,'L3297,'L3063,'L3184,'L3064,'L3061,'L3182,'L3062,'L3183,'L3181,'L3069,'L3067,'L3188,'L3068,'L3065,'L3066,'L3310,'L1814,'L3317,'L3203,'L3324,'L3322,'L3202,'L3323,'L3200,'L0734,'L1705,'L0737,'L3209,'L0736,'L3207,'L3328,'L3208,'L3329,'L3205,'L0732,'L3327,'L0739,'L0738,'L3096,'L3094,'L3095,'L3092,'L3093,'L3091,'L0388,'L0702,'L0704,'L0703,'L2327,'L0709,'L0706,'L0705,'L0708,'L0707,'L0390,'L3302,'L2452,'L0713,'L0712,'L3308,'L0714,'L3306,'L3307,'L0711,'L0710,'L3305,'L3115,'L3236,'L3116,'L3237,'L0762,'L3113,'L3234,'L3114,'L3233,'L3230,'L3110,'L3231,'L0768,'L0404,'L0767,'L0769,'L0764,'L3119,'L0763,'L3238,'L0765,'L3118,'L0771,'L3126,'L1983,'L3127,'L3248,'L3245,'L3125,'L3246,'L3122,'L3243,'L3123,'L3244,'L3120,'L3000,'L0418,'L2957,'L3009,'L2955,'L3007,'L3214,'L0740,'L2485,'L3211,'L3330,'L0745,'L3218,'L0741,'L0744,'L3104,'L3105,'L3226,'L3102,'L3103,'L3224,'L3101,'L2130,'L0757,'L0756,'L2816,'L0759,'L0758,'L3108,'L3107,'L3030,'L3151,'L3031,'L3152,'L3273,'L3150,'L3038,'L3159,'L2984,'L3278,'L2982,'L3037,'L3279,'L3155,'L2980,'L3156,'L3277,'L2064,'L3032,'L3153,'L3274,'L3033,'L3154,'L3275,'L2989,'L0444,'L2988,'L0447,'L2985,'L2986,'L2073,'L3041,'L3162,'L3283,'L3042,'L3163,'L3160,'L3281,'L3040,'L3161,'L3280,'L0694,'L0696,'L2992,'L3047,'L3048,'L3169,'L3166,'L3167,'L0692,'L3043,'L3164,'L3165,'L2999,'L2996,'L3130,'L3137,'L3017,'L3014,'L3135,'L3015,'L3136,'L3012,'L3134,'L3010,'L3131,'L3252,'L3132,'L2969,'L2966,'L3018,'L3139,'L3019,'L3140,'L3020,'L3141,'L3148,'L3269,'L0671,'L2610,'L2973,'L3149,'L3025,'L2971,'L3147,'L3024,'L3145,'L3142,'L3143,'L2979,'L2735,'L2974,'L3029}
2-bookie-9c88965c-2022-07-20T14-55-24Z.log.gz:2022-07-20T14:51:48.736027964+00:00 stdout F 14:51:48,735 DEBUG Reading reply session id: 0x3016baeab7f000f, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 finished:false header:: 36521,4 replyHeader:: 36521,38654709412,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000/L3246,F response:: #426f6f6b69654d65746164617461466f726d617456657273696f6e933a7c8210218ffffff9f92002833238a19626f6f6b6b65657065722d626f6f6b69652d322d3131353830a19626f6f6b6b65657065722d626f6f6b69652d302d32383831361003834204825a16ab6170706c69636174696f6e127507261766567615a15af426f6f6b4b65657065724c6f6749641223133600,s{38654707364,38654708391,1658328325290,1658328491164,2,0,0,0,155,0,38654707364}
```

I don't know whether the fact that this request queries `/pravega/pravega/bookkeeper/ledgers/00/0000/L3246` and gets an actual payload in response means that the znode actually exists in ZooKeeper.

> which are your replication parameters: ES, WQ, AQ ?

In this experiment, the configuration is `ensembleSize=2, writeQuorumSize=2, ackQuorumSize=2`.

> how many bookies do you have ?

The BookKeeper service is configured to keep 4 Bookies.

> how many entries are supposed to be in the ledger ? only 1 ?

Pravega rolls over a ledger when it reaches its max size (1GB by default) or when there is a container recovery (due to fencing). So this particular ledger had been created but contained 0 entries at the moment the issue happened. It would have contained many more if the issue had not appeared.

> at a first glance it looks to me that N bookies are answering that they do not know the ledger

Could be. According to this log, the ledger should have been created using 2 Bookies:

`2022-07-20T14:45:25.706687124+00:00 stdout F 2022-07-20 14:45:25,706 7043990 [ZKC-connect-executor-0-EventThread] INFO o.a.b.client.LedgerCreateOp - Ensemble: [bookkeeper-bookie-2-11580, bookkeeper-bookie-0-28816] for ledger: 3246`

But in the Bookie logs, I mainly see `2-bookie-9c88965c` with logs related to ledger `3246`. The test inducing network drops may be exposing a potential issue that is hard to discover otherwise.
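
As a rough cross-check of the second ZooKeeper reply quoted above, the sketch below decodes a few byte runs from the hex payload and reads the trailing `s{...}` tuple. It rests on two assumptions that the log itself does not confirm: that the tuple follows the field order of ZooKeeper's `Stat` record (czxid, mzxid, ctime, mtime, version, cversion, aversion, ephemeralOwner, dataLength, numChildren, pzxid), and that the debug output prints some bytes without zero padding, so only runs that align cleanly on byte boundaries are decoded. The helper names are illustrative and not part of BookKeeper or ZooKeeper.

```python
from datetime import datetime, timedelta, timezone

# Hex runs copied verbatim from the znode payload in the log above.
# Assumption: the rest of the payload is not reliably decodable because
# the debug output appears to drop leading zeros on some bytes.
FRAGMENTS = [
    "426f6f6b69654d65746164617461466f726d617456657273696f6e",
    "626f6f6b6b65657065722d626f6f6b69652d322d3131353830",
    "626f6f6b6b65657065722d626f6f6b69652d302d3238383136",
]

# Assumed field order of the s{...} tuple (ZooKeeper Stat record).
STAT_FIELDS = (
    "czxid", "mzxid", "ctime", "mtime", "version", "cversion",
    "aversion", "ephemeralOwner", "dataLength", "numChildren", "pzxid",
)

STAT_TEXT = (
    "s{38654707364,38654708391,1658328325290,1658328491164,"
    "2,0,0,0,155,0,38654707364}"
)


def decode_fragments(fragments):
    """Decode each hex run as ASCII text."""
    return [bytes.fromhex(f).decode("ascii") for f in fragments]


def parse_stat(text):
    """Turn 's{a,b,...}' into a dict keyed by the assumed Stat field names."""
    values = [int(v) for v in text.strip()[2:-1].split(",")]
    return dict(zip(STAT_FIELDS, values))


def millis_to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    base = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
    return base + timedelta(milliseconds=ms % 1000)


if __name__ == "__main__":
    for text in decode_fragments(FRAGMENTS):
        print(text)
    stat = parse_stat(STAT_TEXT)
    print("created :", millis_to_utc(stat["ctime"]).isoformat())
    print("modified:", millis_to_utc(stat["mtime"]).isoformat())
    print("version :", stat["version"], "dataLength:", stat["dataLength"])
```

Under those assumptions, the decoded runs read `BookieMetadataFormatVersion`, `bookkeeper-bookie-2-11580` and `bookkeeper-bookie-0-28816`, which matches the ensemble in the `LedgerCreateOp` line, and the creation time comes out as 14:45:25.290 UTC on 2022-07-20, within half a second of that log line. This suggests the metadata znode was present with a 155-byte payload when it was read at 14:51:48, though it says nothing about what the bookies themselves had stored.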
