Thanks for this, Owen. Obviously this will need a little work to develop into something suitable for my particular use-case - but then, that's what open source is all about ...
Thanks again,

Tim

On 13 January 2017 at 12:53, Owen Stephens <o...@ostephens.com> wrote:

> So just out of curiosity I ran 2500 author names from the DOAJ through
> this library (I used a version someone has kindly wrapped as a webservice
> http://nameparse.herokuapp.com/?name=Firstname+Surname). The names were
> just some I had handy so no real attempt to challenge the software.
>
> In general it seemed to do pretty well, but it isn’t perfect. In
> particular two part given names or two part family names where the parts
> are separated by a space end up with part of the name in the ‘middle name’.
> This may not matter too much to you in cases where this affects the given
> name, because you’ll end up with the same output string if you format as
> {family name}, {first name} {middle name}. However in cases where the
> surname is split by a space (as it is for my kids) then you end up with a
> problem - e.g.:
>
> Jane Bloggs Doe - where the surname is ‘Bloggs Doe’, would end up being
> converted to: Doe, Jane Bloggs instead of Bloggs Doe, Jane
>
> I tend to use OpenRefine to do this kind of work and this allows you to do
> lookups on webservices such as the one I’ve used - so this is a pretty
> useful addition to my toolset - thanks for asking the question!
>
> Owen
>
>
> Owen Stephens
> Owen Stephens Consulting
> Web: http://www.ostephens.com
> Email: o...@ostephens.com
> Telephone: 0121 288 6936
>
>
> On 13 Jan 2017, at 11:34, Timothy Hill <timothy.d.h...@gmail.com> wrote:
> >
> > Please excuse the naive way this question is formulated: I'm sure the
> > Information & Library Science community has formal terms for what I'm
> > attempting to do, but unfortunately I don't know what they are.
> >
> > The problem I'm trying to solve is that I have a bunch of author names (for
> > example, 'Charles Dickens') that I need to reformat into standard catalogue
> > order ('Dickens, Charles'). Obviously the example given is trivial, but of
> > course this can get quite complex depending on the addition of titles and
> > honorifics.
> >
> > Is anyone aware of a software library to perform this kind of conversion?
> > The programming language used is not terribly important, though Java or
> > Python would be preferable.
> >
> > In ideal world the library would deal with the different conventions used
> > in different languages and by different institutions - but anything would
> > be better than the current split-on-comma approach I'm using right now.
> >
> > Thanks,
> >
> > Timothy Hill
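
For anyone following the thread, the split-surname problem Owen describes can be reproduced with a few lines of plain Python. This is only a rough sketch, not the library or the webservice mentioned above: the function name `to_catalogue_order` and the `family_name_tokens` parameter are made up for illustration, and it assumes that names arrive as space-separated tokens with no titles or honorifics.

```python
def to_catalogue_order(name, family_name_tokens=1):
    """Reformat 'Given [Middle] Family' as 'Family, Given [Middle]'.

    family_name_tokens is a hypothetical knob saying how many trailing
    tokens make up the family name. The default of 1 treats only the
    last token as the surname, which is the behaviour that goes wrong
    for surnames split by a space.
    """
    parts = name.split()
    if len(parts) < 2:
        # A single token (or empty string) has nothing to reorder.
        return name.strip()
    # Keep at least one token for the family name and one for the given name.
    n = max(1, min(family_name_tokens, len(parts) - 1))
    family = " ".join(parts[-n:])
    given = " ".join(parts[:-n])
    return f"{family}, {given}"


print(to_catalogue_order("Charles Dickens"))
print(to_catalogue_order("Jane Bloggs Doe"))
print(to_catalogue_order("Jane Bloggs Doe", family_name_tokens=2))
```

The second call gives 'Doe, Jane Bloggs' and the third gives 'Bloggs Doe, Jane'. The point is that the string alone does not say which reading is right, so a two-part surname needs some outside hint (here, a manually supplied token count) to come out correctly.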