I think there are a few details we need to discuss.

How frequently should a source update its metrics? For example, if a file source has to report size metrics per row, it will be very slow.
What metrics should a source report? Data size? numFiles? Read time?

Should we show the metrics in the SQL web UI as well?

On Fri, Jan 17, 2020 at 3:07 PM Sandeep Katta <sandeep0102.opensou...@gmail.com> wrote:

> Hi Devs,
>
> Currently DS V2 does not update any input metrics. SPARK-30362 aims at
> solving this problem.
>
> We can have the below approach. Have a marker interface, let's say
> "ReportMetrics".
>
> If the DataSource implements this interface, then it will be easy to
> collect the metrics.
>
> For e.g. FilePartitionReaderFactory can support metrics.
>
> So it will be easy to collect the metrics if FilePartitionReaderFactory
> implements ReportMetrics.
>
> Please let me know your views, or even if we want to have a new solution or
> design.
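
To make the update-frequency question concrete, here is a rough sketch of one possible shape for this. Everything in it is an assumption for discussion and not existing Spark API: the `ReportMetrics` interface only borrows the name from the proposal, and `BatchedMetricsReader`, the metric names `bytesRead` and `recordsRead`, and the `updateInterval` parameter are hypothetical. The idea is that the reader counts locally on every row but publishes only every N rows (and once more on close), so the per-row cost stays at a couple of additions.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical interface named after the proposal; NOT an existing Spark API.
interface ReportMetrics {
    // Snapshot of metric name -> last published value.
    Map<String, Long> currentMetrics();
}

// Toy stand-in for a partition reader (hypothetical, for illustration only).
// It counts every row locally and publishes the totals every
// `updateInterval` rows, plus one final time on close().
final class BatchedMetricsReader implements ReportMetrics {
    private final Iterator<String> rows;
    private final long updateInterval;
    private long pendingBytes = 0;
    private long pendingRecords = 0;
    private long publishedBytes = 0;
    private long publishedRecords = 0;

    BatchedMetricsReader(List<String> rows, long updateInterval) {
        this.rows = rows.iterator();
        this.updateInterval = updateInterval;
    }

    boolean hasNext() {
        return rows.hasNext();
    }

    String next() {
        String row = rows.next();
        pendingBytes += row.length(); // rough size: characters, not encoded bytes
        pendingRecords++;
        if (pendingRecords >= updateInterval) {
            flush();
        }
        return row;
    }

    void close() {
        flush(); // publish whatever is left over from the last partial batch
    }

    private void flush() {
        publishedBytes += pendingBytes;
        publishedRecords += pendingRecords;
        pendingBytes = 0;
        pendingRecords = 0;
    }

    @Override
    public Map<String, Long> currentMetrics() {
        Map<String, Long> snapshot = new HashMap<>();
        snapshot.put("bytesRead", publishedBytes);
        snapshot.put("recordsRead", publishedRecords);
        return snapshot;
    }
}
```

With something like this, the scan side could check whether a reader is an instance of `ReportMetrics` and pull `currentMetrics()` when it wants to refresh the task's input metrics. Whether the interval should be fixed or configurable, and which metrics belong in the snapshot, are exactly the open questions above.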