Hello Bill (and others who might be interested)

> First off, I would recommend that you always capitalize the 
> "R", since when I read the original message, I wondered why 
> someone would create a web site called "we're late"? ;^)

Good point.  It's kind of a play on words, since "late" is another word for
"dead," but that's probably not obvious.

> The site looks good, but I didn't see a place for feedback, 
> so I'm sending it here.

I'll make that more explicit.

> The search has a strange feature.  I was looking for the 
> surname Leiden in Pennsylvania, and the first few entries 
> were from a town called Leiden in the Netherlands.  I'm 
> wondering if the index doesn't distinguish between place 
> names and people names.

The site doesn't do named-entity recognition yet, so it can't distinguish
between Leiden used as a surname and Leiden used as a place.  The reason for
separate surname and place fields is query expansion.  If you enter Leiden
as a surname and check "include related names," we look up the names related
to Leiden in the surname database and add them to the query, weighted lower
than an exact match.  If you enter Leiden as a place, we instead expand the
query with the alternate names for Leiden in the places database (Leida,
Leyde, Leyden, and Lugdunum Batavorum).  I need to explain this on the
search page.
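
To make the difference concrete, here is a rough Python sketch of the two
kinds of expansion.  It is not the site's actual code.  The table names, the
related surnames, and the 0.5 weight are placeholders made up for
illustration; only the alternate place names are the real ones listed above.
I've also assumed that place alternates keep full weight.

```python
# Hypothetical lookup tables. The related surnames are made-up placeholders.
RELATED_SURNAMES = {"leiden": ["Leyden", "Leidan"]}
ALTERNATE_PLACE_NAMES = {
    "leiden": ["Leida", "Leyde", "Leyden", "Lugdunum Batavorum"],
}

RELATED_WEIGHT = 0.5  # assumed value; related names rank below an exact match


def expand_query(term, field, include_related=True):
    """Return a list of (term, weight) pairs for a surname or place query."""
    key = term.lower()
    terms = [(term, 1.0)]  # the exact term always gets full weight
    if field == "surname":
        if include_related:
            for name in RELATED_SURNAMES.get(key, []):
                terms.append((name, RELATED_WEIGHT))
    elif field == "place":
        # Assumption: alternate place names are treated as equivalent.
        for name in ALTERNATE_PLACE_NAMES.get(key, []):
            terms.append((name, 1.0))
    else:
        raise ValueError("field must be 'surname' or 'place'")
    return terms
```

So expand_query("Leiden", "surname") and expand_query("Leiden", "place")
produce different term lists even though the input string is the same.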

One issue right now is that Leiden is a pretty rare name - not in the top
100,000 surnames in the Social Security Death Index - so I haven't created a
surname page for Leiden.  However, it does appear as a related name on 18
other surname pages.  What I need to do in this case is expand the query
with those 18 other names.  I'll get around to doing that soon.
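
Roughly, the fix is a reverse lookup over the existing surname pages.  A
minimal Python sketch, assuming the pages can be read as a mapping from page
name to the related names listed on it (the sample data and function names
are hypothetical, not the real database):

```python
from collections import defaultdict

# Hypothetical surname pages: page name -> related names listed on that page.
SURNAME_PAGES = {
    "Leyden": ["Leiden", "Layden"],
    "Layden": ["Leiden", "Leyden"],
    "Smith": ["Smyth"],
}


def build_reverse_index(pages):
    """Map each related name to the set of pages that list it."""
    reverse = defaultdict(set)
    for page_name, related in pages.items():
        for name in related:
            reverse[name.lower()].add(page_name)
    return reverse


def related_names(surname, pages, reverse):
    """Use the surname's own page if it has one, else the pages that list it."""
    key = surname.lower()
    for page_name, related in pages.items():
        if page_name.lower() == key:
            return sorted(related)
    return sorted(reverse.get(key, ()))


REVERSE_INDEX = build_reverse_index(SURNAME_PAGES)
```

With this sample data, related_names("Leiden", SURNAME_PAGES, REVERSE_INDEX)
falls back to the pages that mention Leiden, since Leiden has no page of its
own.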

> I would assume that in most cases the person name should have 
> the highest priority, followed by the place name.  Also, 
> place names could have a multi-tier priority.  If I specify 
> city, county, state parameters, then a match on all three 
> fields should come first, followed by matches of county and 
> state, followed by state.

This will all be much easier once I get the named-entity recognition in
place.  That's part of the plan, but it's harder than you might think
because a lot of genealogical data is in tables and charts, and traditional
named-entity recognizers expect entities to be located within well-formed
paragraphs of text.  We've made some preliminary progress in this area,
though, and hope to have named-entity recognizers in place over the next
4-6 months.

> BTW - it wasn't clear to me what was wanted for place names.  
> There are two text boxes: "Place" and "Located in".  Should 
> "Located in" have county, state, and/or country?  I would 
> suggest either separate boxes for city, county/municipality, 
> state (or equivalent), and country, or some notes to explain 
> what is wanted where.

Good point.  I followed the convention used in Place Search in the Family
History Library Catalog, where "place" is the place you're searching for and
"located in" restricts the set of matching places and can be any place up
the hierarchy.  So Place=Provo returns Provo, Arkansas; Provo, Kentucky;
etc., but Place=Provo&LocatedIn=Utah returns just Provo, Utah, and
Place=Provo&LocatedIn=Utah County also returns just Provo, Utah.  It sounds
like I need to make this clearer.
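
In case it helps, here is a small Python sketch of that "located in"
behavior.  It only illustrates the convention and is not the catalog's or
the site's implementation.  The PLACES list and search_places function are
hypothetical, and I've left the counties off the Arkansas and Kentucky
entries to keep the sample short.

```python
# Hypothetical place records, each listed from most to least specific.
PLACES = [
    ("Provo", "Utah County", "Utah", "United States"),
    ("Provo", "Arkansas", "United States"),
    ("Provo", "Kentucky", "United States"),
]


def search_places(place, located_in=None):
    """Return places named `place`, optionally restricted to those that
    have `located_in` somewhere above them in the hierarchy."""
    matches = []
    for chain in PLACES:
        name, ancestors = chain[0], chain[1:]
        if name.lower() != place.lower():
            continue
        if located_in is not None:
            wanted = located_in.lower()
            # Any level above the place may satisfy the restriction.
            if wanted not in [a.lower() for a in ancestors]:
                continue
        matches.append(", ".join(chain))
    return matches
```

Here search_places("Provo") returns all three records, while passing either
"Utah" or "Utah County" as located_in narrows it to the single Utah entry.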

Thank you so much for the excellent feedback!

-dallan


_______________________________________________
Ldsoss mailing list
[email protected]
http://lists.ldsoss.org/mailman/listinfo/ldsoss
