Re: [CODE4LIB] Name-inversion software library?

2017-01-17 Thread Timothy Hill
Thanks for this, Owen. Obviously this will need a little work to develop
into something suitable for my particular use-case - but then, that's what
open source is all about ...

Thanks again,

Tim

On 13 January 2017 at 12:53, Owen Stephens wrote:

> So just out of curiosity I ran 2500 author names from the DOAJ through
> this library (I used a version someone has kindly wrapped as a webservice
> http://nameparse.herokuapp.com/?name=Firstname+Surname). The names were
> just some I had handy so no real attempt to challenge the software.
>
> In general it seemed to do pretty well, but it isn’t perfect. In
> particular, two-part given names or two-part family names where the parts
> are separated by a space end up with part of the name in the ‘middle name’
> field. This may not matter too much to you in cases where this affects the
> given name, because you’ll end up with the same output string if you format
> as {family name}, {first name} {middle name}. However, in cases where the
> surname is split by a space (as it is for my kids) you end up with a
> problem, e.g.:
>
> ‘Jane Bloggs Doe’, where the surname is ‘Bloggs Doe’, would end up being
> converted to ‘Doe, Jane Bloggs’ instead of ‘Bloggs Doe, Jane’.
>
> I tend to use OpenRefine to do this kind of work and this allows you to do
> lookups on webservices such as the one I’ve used - so this is a pretty
> useful addition to my toolset - thanks for asking the question!
>
> Owen
>
>
> Owen Stephens
> Owen Stephens Consulting
> Web: http://www.ostephens.com
> Email: o...@ostephens.com
> Telephone: 0121 288 6936
>
> > On 13 Jan 2017, at 11:34, Timothy Hill wrote:
> >
> > Please excuse the naive way this question is formulated: I'm sure the
> > Information & Library Science community has formal terms for what I'm
> > attempting to do, but unfortunately I don't know what they are.
> >
> > The problem I'm trying to solve is that I have a bunch of author names (for
> > example, 'Charles Dickens') that I need to reformat into standard catalogue
> > order ('Dickens, Charles'). Obviously the example given is trivial, but of
> > course this can get quite complex depending on the addition of titles and
> > honorifics.
> >
> > Is anyone aware of a software library to perform this kind of conversion?
> > The programming language used is not terribly important, though Java or
> > Python would be preferable.
> >
> > In an ideal world the library would deal with the different conventions used
> > in different languages and by different institutions - but anything would
> > be better than the current split-on-comma approach I'm using right now.
> >
> > Thanks,
> >
> > Timothy Hill
>


Re: [CODE4LIB] Name-inversion software library?

2017-01-13 Thread Owen Stephens
So just out of curiosity I ran 2500 author names from the DOAJ through this 
library (I used a version someone has kindly wrapped as a webservice 
http://nameparse.herokuapp.com/?name=Firstname+Surname). The names were just 
some I had handy so no real attempt to challenge the software.

In general it seemed to do pretty well, but it isn’t perfect. In particular,
two-part given names or two-part family names where the parts are separated by
a space end up with part of the name in the ‘middle name’ field. This may not
matter too much to you in cases where this affects the given name, because
you’ll end up with the same output string if you format as {family name},
{first name} {middle name}. However, in cases where the surname is split by a
space (as it is for my kids) you end up with a problem, e.g.:

‘Jane Bloggs Doe’, where the surname is ‘Bloggs Doe’, would end up being
converted to ‘Doe, Jane Bloggs’ instead of ‘Bloggs Doe, Jane’.
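
To make the failure mode concrete, here is a minimal Python sketch. It is not
the library's code: invert_name and its surname_tokens parameter are
hypothetical names, and the "last token is the family name" rule is only an
assumption that happens to reproduce the output described above.

```python
def invert_name(name: str, surname_tokens: int = 1) -> str:
    """Reformat 'Given Family' as 'Family, Given'.

    surname_tokens is a hypothetical override saying how many trailing
    tokens belong to the family name. The default of 1 mimics a naive
    last-token heuristic (an assumption, not any real library's logic).
    """
    tokens = name.split()
    if len(tokens) < 2:
        # A single-token name has nothing to invert.
        return name.strip()
    if not 1 <= surname_tokens < len(tokens):
        raise ValueError("surname_tokens must leave at least one given name")
    given = " ".join(tokens[:-surname_tokens])
    family = " ".join(tokens[-surname_tokens:])
    return f"{family}, {given}"


print(invert_name("Charles Dickens"))
print(invert_name("Jane Bloggs Doe"))                    # naive split
print(invert_name("Jane Bloggs Doe", surname_tokens=2))  # manual correction
```

Nothing in the string itself says where the family name starts, so the
correction has to come from outside (a reviewed list of known two-part
surnames, say), which is why names like this need a manual check.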

I tend to use OpenRefine to do this kind of work and this allows you to do 
lookups on webservices such as the one I’ve used - so this is a pretty useful 
addition to my toolset - thanks for asking the question!
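
For anyone scripting the lookup outside OpenRefine, a small sketch of building
the request URL in Python follows. The base URL and the "name" query parameter
come from the example URL above; lookup_url is a hypothetical helper name. The
response format isn't described in this thread, so the sketch stops at
constructing the URL and makes no assumption about what comes back or whether
the service is still running.

```python
from urllib.parse import urlencode

# Base URL taken from the example earlier in this message.
BASE_URL = "http://nameparse.herokuapp.com/"


def lookup_url(name: str) -> str:
    """Build the lookup URL for one name (hypothetical helper).

    urlencode turns spaces into '+' and percent-encodes non-ASCII
    characters as UTF-8, so accented names are safe to pass in.
    """
    return f"{BASE_URL}?{urlencode({'name': name})}"


print(lookup_url("Firstname Surname"))
```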

Owen


Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: o...@ostephens.com
Telephone: 0121 288 6936

> On 13 Jan 2017, at 11:34, Timothy Hill wrote:
> 
> Please excuse the naive way this question is formulated: I'm sure the
> Information & Library Science community has formal terms for what I'm
> attempting to do, but unfortunately I don't know what they are.
> 
> The problem I'm trying to solve is that I have a bunch of author names (for
> example, 'Charles Dickens') that I need to reformat into standard catalogue
> order ('Dickens, Charles'). Obviously the example given is trivial, but of
> course this can get quite complex depending on the addition of titles and
> honorifics.
> 
> Is anyone aware of a software library to perform this kind of conversion?
> The programming language used is not terribly important, though Java or
> Python would be preferable.
> 
> In an ideal world the library would deal with the different conventions used
> in different languages and by different institutions - but anything would
> be better than the current split-on-comma approach I'm using right now.
> 
> Thanks,
> 
> Timothy Hill