GitHub user dakirsa commented on the issue:
https://github.com/apache/spark/pull/19439
@hhbyyh, @thunterdb
> Not sure about the reason to include "origin" info into the image data.
> Based on my experience, path info serves better as a separate column in
> the DataFrame. (E.g. prediction)
One of the main reasons is MLlib pipelines: transformers and estimators
operate on a single DataFrame column, so it is much easier when "origin" is
part of that column too.
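To illustrate the point, here is a minimal plain-Python sketch (no Spark involved). The `Image` class, its field names, and the `transform_column` helper are hypothetical stand-ins, not the schema or API proposed in this PR. It only shows that a stage which reads one input column can still see the origin when the origin travels inside that column.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Image:
    # Hypothetical stand-in for an image struct; field names are illustrative.
    origin: str
    data: bytes


Row = Dict[str, object]


def transform_column(rows: List[Row], input_col: str, output_col: str,
                     fn: Callable[[Image], object]) -> List[Row]:
    # Mimics a single-column transformer: reads one column, appends one column.
    return [{**row, output_col: fn(row[input_col])} for row in rows]


def describe(image: Image) -> str:
    # The origin is available here because it is part of the input column.
    return f"{image.origin}: {len(image.data)} bytes"


rows = [
    {"image": Image("file:/tmp/a.png", b"\x00\x01\x02")},
    {"image": Image("file:/tmp/b.png", b"\x00")},
]
result = transform_column(rows, "image", "summary", describe)
```

If the origin lived in a separate column instead, `describe` would need a second input column, which a single-input stage in this sketch cannot receive.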