I have a Trident topology that processes tuples from an opaque Kafka spout. Three bolts each write a different form of the processed output to HDFS, and each form has a different record size. All three outputs use the same parallelism and the same rotation policy, and as a result two of the streams end up writing very small files to HDFS. I’m thinking of setting a different parallelism for each output. Is there any reason not to use different parallelism settings across the output bolts?
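
To make the trade-off concrete, here is a rough sketch of the arithmetic I have in mind. It assumes a time-based rotation policy and tuples spread evenly across the writer tasks, neither of which may hold exactly in practice. The class name, method names, and all figures are hypothetical and only for illustration.

```java
// Hypothetical back-of-the-envelope estimate, not part of the topology.
// Assumes time-based rotation and an even spread of tuples across writers.
public class FileSizeEstimate {

    // Approximate bytes in each rotated file for one output stream.
    static long bytesPerFile(long recordsPerSecond, long recordBytes,
                             long rotationSeconds, int parallelism) {
        long bytesPerInterval = recordsPerSecond * recordBytes * rotationSeconds;
        return bytesPerInterval / parallelism;
    }

    // Largest parallelism that still yields files of at least targetBytes,
    // never going below a single writer.
    static int maxParallelismForTarget(long recordsPerSecond, long recordBytes,
                                       long rotationSeconds, long targetBytes) {
        long bytesPerInterval = recordsPerSecond * recordBytes * rotationSeconds;
        return (int) Math.max(1L, bytesPerInterval / targetBytes);
    }

    public static void main(String[] args) {
        // Made-up numbers: 1000 records/s, 200-byte records, 60 s rotation.
        System.out.println(bytesPerFile(1000, 200, 60, 4));
        System.out.println(maxParallelismForTarget(1000, 200, 60, 4_000_000L));
    }
}
```

Under these assumptions, an output with small records would need a lower parallelism (or a longer rotation interval) than an output with large records to reach the same file size.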
