Maybe DataImportHandler should subclass ContentStreamHandlerBase, which already calls #finish. That would mean implementing a new ContentStreamLoader, which would let DIH hand the streams off to entities either as data sources or as data, right? This is where we want to head with Tika integration into DIH, methinks.
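
To make the shape of the idea concrete, here is a rough sketch. It is a simplified model, not Solr's real API: `Stream`, `Loader`, `Processor`, `HandlerBase`, and `DihHandler` are hypothetical stand-ins for ContentStream, ContentStreamLoader, the update processor, ContentStreamHandlerBase, and DataImportHandler, and the method signatures are assumptions made up for illustration. The only point is that the base class owns the call to finish, so the subclass just supplies a loader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DihLoaderSketch {
    // Hypothetical stand-in for a content stream: a name plus a body.
    static final class Stream {
        final String name;
        final String body;
        Stream(String name, String body) {
            this.name = name;
            this.body = body;
        }
    }

    // Hypothetical stand-in for a content stream loader.
    interface Loader {
        void load(Stream stream);
    }

    // Hypothetical stand-in for the update processor; it records calls in a log.
    static final class Processor {
        final List<String> log = new ArrayList<>();
        void finish() {
            log.add("finish");
        }
    }

    // Hypothetical stand-in for the base handler: it asks the subclass for a
    // loader, feeds every stream to it, and always calls finish() itself.
    abstract static class HandlerBase {
        abstract Loader newLoader(Processor processor);

        final void handle(List<Stream> streams, Processor processor) {
            Loader loader = newLoader(processor);
            try {
                for (Stream stream : streams) {
                    loader.load(stream);
                }
            } finally {
                processor.finish();
            }
        }
    }

    // Hypothetical DIH-style handler: it only provides the loader, which is
    // where a stream would be handed to an entity as a data source or as data.
    static final class DihHandler extends HandlerBase {
        @Override
        Loader newLoader(Processor processor) {
            return stream -> processor.log.add("import:" + stream.name);
        }
    }

    // Runs two streams through the handler and returns the recorded calls.
    static List<String> run() {
        Processor processor = new Processor();
        List<Stream> streams = Arrays.asList(
                new Stream("a.xml", "<doc/>"),
                new Stream("b.pdf", "binary"));
        new DihHandler().handle(streams, processor);
        return processor.log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model the subclass never calls finish on its own, which is the behavior DIH would inherit by going through the base class.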
Thoughts? Erik