Ben, where did strings like "amazon.co.jp" come from? Did you grab
the domain names, or were these all text strings found in the field?

kc

On 2/22/12 4:30 AM, Ben Companjen wrote:
> Hi all,
>
> Last night I ran a script to count the identifiers found in Edition
> records in the dump of January 31st.
>
> It counted 173 identifiers, including ISBN 10 and 13, ocaid, oclc
> numbers and all the variations of the identifiers in the list in the
> edit form. There is a lot of junk in this list (starting with "1sbn",
> "Select", "isbn", "isbn13"..), but more effort is needed to find the
> records that contain the junk and clean it up. It appears that it
> contains classifications too - just like the edit form does?
>
> The CSV list is at https://gist.github.com/1884546 - the second column
> contains the total number of occurrences of the id (counting all the
> instances in each record), the third column is the number of records
> that contain the id.
>
> Regards,
>
> Ben
> _______________________________________________
> Ol-tech mailing list
> [email protected]
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
> To unsubscribe from this mailing list, send email to 
> [email protected]
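
For anyone following the thread, here is a minimal sketch of the kind of
count described in the quoted message. It is not Ben's actual script. It
assumes each dump line is tab-separated with the record JSON in the last
column, and that an edition keeps identifiers both in a few top-level
fields and in an "identifiers" object mapping a name to a list of values.
The names count_identifiers, read_dump and TOP_LEVEL_ID_FIELDS are made up
for illustration. Under these assumptions a name such as "amazon.co.jp"
would simply be a key of that object; whether the real data looks like
that is the open question above.

```python
import json
from collections import Counter

# Assumed top-level identifier fields; the real script may use another list.
TOP_LEVEL_ID_FIELDS = ("isbn_10", "isbn_13", "ocaid", "oclc_numbers")


def count_identifiers(records):
    """Return (totals, per_record) Counters keyed by identifier name.

    totals[name]     -> number of values across all records (CSV column 2)
    per_record[name] -> number of records containing the name (CSV column 3)
    """
    totals = Counter()
    per_record = Counter()
    for rec in records:
        found = {}
        # Identifiers stored directly on the record.
        for field in TOP_LEVEL_ID_FIELDS:
            value = rec.get(field)
            if value:
                found[field] = len(value) if isinstance(value, list) else 1
        # Identifiers stored under the (assumed) "identifiers" object.
        for name, values in (rec.get("identifiers") or {}).items():
            n = len(values) if isinstance(values, list) else 1
            if n:
                found[name] = found.get(name, 0) + n
        for name, n in found.items():
            totals[name] += n
            per_record[name] += 1
    return totals, per_record


def read_dump(lines):
    """Yield the parsed JSON record from each tab-separated dump line."""
    for line in lines:
        yield json.loads(line.rstrip("\n").split("\t")[-1])
```

Feeding it an open dump file, e.g. count_identifiers(read_dump(f)), gives
the two counts per identifier name. Junk names such as "1sbn" would show
up as ordinary keys, so finding the records that contain them would take
a second pass.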

-- 
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
