I think we should NOT reinvent the terms used in the data quality industry.
'measured' and 'reference' are new terms, but they still have the same issue
as 'source' and 'target': they are too abstract to understand.
I prefer to use source and target,
but in the implementation we can name them more elaborately, such as
On Mon, Apr 9, 2018 at 11:25 AM, Lionel Liu <lionel...@apache.org> wrote:
> Hi all,
> Recently, we've received some questions about the definition of accuracy. In my
> opinion, accuracy measures one data source against another one which is the
> In the theoretical definition of accuracy, we call the "truth" data source
> "source", and the data source to be measured "target". Actually,
> "source" and "target" are not standard names in the definition.
> However, in the implementation we did it the opposite way: the data
> source to be measured was called "source", and the "truth" data source was
> called "target", which reads like the relationship of "compare source with
> I want to align the implementation with the theoretical definition now,
> to reduce confusion for users and developers. But I suspect that
> "target" and "source" are not clear enough to describe this relationship,
> and might still be confusing. When a user needs to choose data sources as
> "source" and "target", they will still ask: which is the source,
> and which is the target?
> I think we could find some better names than "target" and "source"
> to reduce such confusion. For example, I think the names "measured" and
> "reference" would be much clearer.
> What do you think? Or do you have some better names?
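
To make the two roles in this thread concrete, here is a minimal sketch. It assumes accuracy means the fraction of records in the data source being measured that also appear in the "truth" data source. The function name, the parameter names, and the sample records are hypothetical and are not taken from the actual implementation.

```python
from collections import Counter


def accuracy(measured, reference):
    """Fraction of records in `measured` that have a match in `reference`.

    `measured` is the data source being checked; `reference` is the
    "truth" data source. Both are lists of hashable records (assumption).
    """
    if not measured:
        # Assumption: an empty measured data source counts as fully accurate.
        return 1.0
    remaining = Counter(reference)  # multiset of reference records
    matched = 0
    for record in measured:
        if remaining[record] > 0:
            remaining[record] -= 1  # each reference record matches at most once
            matched += 1
    return matched / len(measured)


# Hypothetical sample data: two of the four measured records match the truth.
reference = [("u1", "alice"), ("u2", "bob"), ("u3", "carol")]
measured = [("u1", "alice"), ("u2", "bobby"), ("u3", "carol"), ("u4", "dave")]
print(accuracy(measured, reference))  # 0.5
```

Whichever pair of names is chosen, swapping the two arguments changes the result, so the names need to make the direction of the comparison obvious.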