New reply on DataCleaner's online discussion forum 
(http://datacleaner.org/forum):

Kasper Sørensen replied to subject 'ValueDistribution analysis on CSV file has 
results that has many <count=> tags'

-------------------

As for the reasoning: it means that there are many values which each appear N 
times. For instance, these lines:

<count=2>       1070
<count=3>       249

mean that you have 1070 distinct values that each appear twice and 249 that 
each appear three times. You can drill down to see which values they are. In a 
sense it's just an extension of the "unique" label: a way to keep the report 
from becoming too long.
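
To make the grouping concrete, here is a minimal Python sketch of the idea. It 
is not DataCleaner's actual code: the function name summarize and the 
group_threshold parameter are hypothetical, chosen only for illustration, and 
the real thresholds may work differently.

```python
from collections import Counter

def summarize(values, group_threshold=1):
    """Hypothetical sketch of a value distribution report (not DataCleaner's API).

    Values sharing the same occurrence count N are collapsed into a single
    "<count=N>" row when more than group_threshold values share that count.
    """
    occurrences = Counter(values)  # value -> number of times it appears

    # Invert the mapping: occurrence count -> list of values with that count.
    by_count = {}
    for value, n in occurrences.items():
        by_count.setdefault(n, []).append(value)

    rows = []
    for n in sorted(by_count, reverse=True):
        group = by_count[n]
        if n == 1:
            # Values seen exactly once are reported together as "unique".
            rows.append(("<unique>", len(group)))
        elif len(group) > group_threshold:
            # Too many values share this count: report how many there are.
            rows.append((f"<count={n}>", len(group)))
        else:
            # Few enough values: list each one with its occurrence count.
            rows.extend((str(v), n) for v in sorted(group, key=str))
    return rows

sample = ["a", "a", "b", "b", "c", "c", "c", "d", "e"]
report = summarize(sample)
for label, number in report:
    print(f"{label}\t{number}")
```

Note that the number on a grouped row is the count of distinct values in that 
group, whereas on an individual row it is how often that value occurs.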

The reason I call it a corner case is the thresholds used in the grouping here. 
It seems that in your case _maybe_ the report would still be acceptable to look 
at if we did not do this trick. But if you had, for instance, two or three 
times as many records (or even just a million records or so), then I would feel 
more confident saying "you do want this feature, even if you don't know it".

-------------------

View the topic online to reply - go to 
http://datacleaner.org/topic/1084/ValueDistribution-analysis-on-CSV-file-has-results-that-has-many-%3Ccount%3D%3E-tags

