Renkai opened a new issue #8591: URL: https://github.com/apache/pulsar/issues/8591
Currently, if data in Pulsar has been offloaded to the second storage layer, it can still exist in BookKeeper for a period of time, but the client will read it directly from the second layer. This may lead to several problems:

- Reads from the second layer have different performance characteristics, which may lead users to wrong estimates if they don't know which layer they are reading from.
- The second layer may be managed by a team other than the Pulsar management team (for example, an independent HDFS management team), which may apply its own quota or authorization policies to users.
- The second layer's storage can be infinite in theory, so if a user accidentally sets a cursor to the wrong time, it will waste a lot of resources.

So it's better to make the data source configurable when data exists in both layers. Maybe the options below are enough:

- first layer only
- first layer first
- second layer only
- second layer first

We can make `second layer first` the default value, which results in the same behavior as the current version.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
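
To make the four proposed options concrete, here is a minimal sketch of how each one could resolve to a storage layer when data may exist in one or both layers. The names `ReadSource` and `pick_layer` are hypothetical and chosen only for illustration; they are not existing Pulsar APIs or configuration keys.

```python
from enum import Enum
from typing import Optional


class ReadSource(Enum):
    # Hypothetical option names mirroring the four choices in the issue.
    FIRST_LAYER_ONLY = "first-layer-only"
    FIRST_LAYER_FIRST = "first-layer-first"
    SECOND_LAYER_ONLY = "second-layer-only"
    SECOND_LAYER_FIRST = "second-layer-first"  # proposed default


# Preference order of layers for each option (assumed semantics).
_ORDER = {
    ReadSource.FIRST_LAYER_ONLY: ("first",),
    ReadSource.FIRST_LAYER_FIRST: ("first", "second"),
    ReadSource.SECOND_LAYER_ONLY: ("second",),
    ReadSource.SECOND_LAYER_FIRST: ("second", "first"),
}


def pick_layer(policy: ReadSource, in_first: bool, in_second: bool) -> Optional[str]:
    """Return "first", "second", or None when no permitted layer holds the data."""
    available = {"first": in_first, "second": in_second}
    for layer in _ORDER[policy]:
        if available[layer]:
            return layer
    return None
```

With this sketch, `pick_layer(ReadSource.SECOND_LAYER_FIRST, True, True)` selects the second layer, matching the current behavior described above, while `FIRST_LAYER_ONLY` returns `None` once the data has been removed from the first layer, so the caller would have to decide how to report that case.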
