Re: [CODE4LIB] Blacklight implementation at United States Holocaust Memorial Museum

2012-12-11 Thread ddwiggins
+1 -- would definitely like to read more about this process. We're doing 
something similar, pulling data from multiple databases into Solr, but with a 
custom-built frontend (i.e., no Blacklight). This would be a very useful case 
study.
 
-David

 
__
 
David Dwiggins
Systems Librarian/Archivist, Historic New England
141 Cambridge Street, Boston, MA 02114
(617) 994-5948
ddwigg...@historicnewengland.org
http://www.historicnewengland.org
>>> Francis Kayiwa  12/11/2012 11:56 AM >>>
On Tue, Dec 11, 2012 at 11:50:38AM -0500, Levy, Michael wrote:
> For our commercial ILS, of course SolrMarc was the thing. I started with
> the default Blacklight configuration and customized that. The MARC export
> is completely updated nightly.
> 
> The collections management system provided a Java API, and programmers here
> (not me) created a Solr XML file, created nightly and updated into Solr.
You're doing this wrong if this is not an article in the Code4Lib Journal.

I'd read it! :-)

./fxk


-- 
There are people so addicted to exaggeration
that they can't tell the truth without lying.
-- Josh Billings


Re: [CODE4LIB] Blacklight implementation at United States Holocaust Memorial Museum

2012-12-11 Thread Francis Kayiwa
On Tue, Dec 11, 2012 at 11:50:38AM -0500, Levy, Michael wrote:
> For our commercial ILS, of course SolrMarc was the thing. I started with
> the default Blacklight configuration and customized that. The MARC export
> is completely updated nightly.
> 
> The collections management system provided a Java API, and programmers here
> (not me) created a Solr XML file, created nightly and updated into Solr.
> 
> For the two desktop database applications that power the photo archives and
> film and video archives, a programmer here created PHP interfaces to create
> a Solr XML file which is updated nightly. Same with the custom MSSQL
> application that sources the "Names Source" type records -- a PHP script
> runs nightly to create the Solr XML.
> 
> Another interesting feature is that some records are for internal display
> only. There's an intranet version on a server on our LAN and another on a
> web server. For each record, a Solr field indicates whether it's OK for
> web exposure. I had to decide whether it made more sense to maintain two
> Solr indexes or a single Solr instance and decided on a single Solr
> instance. I was able to use some Blacklight configuration
> (config.default_solr_params and config.default_document_solr_params) to
> ensure those are filtered out. In addition, certain fields and certain
> facets are only displayed on the internal version. The switching is done in
> catalog_controller.rb based on where the application is sitting (via
> Rails.root). A few differences are also based on where the user is sitting
> (e.g. a user on our LAN viewing the public web version will get streaming
> media from our LAN and not from our streaming host).

You're doing this wrong if this is not an article in the Code4Lib Journal.

I'd read it! :-)

./fxk


-- 
There are people so addicted to exaggeration
that they can't tell the truth without lying.
-- Josh Billings


Re: [CODE4LIB] Blacklight implementation at United States Holocaust Memorial Museum

2012-12-11 Thread Levy, Michael
For our commercial ILS, of course SolrMarc was the thing. I started with
the default Blacklight configuration and customized it. The full MARC export
is refreshed nightly.
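
For a flavor of the SolrMarc side: the MARC-to-Solr field mappings live in an
index.properties file along these lines (a simplified sketch, not our actual
mappings -- the Solr field names and MARC tags here are only illustrative):

    # index.properties (sketch): Solr field = MARC tag(s) + subfield(s)
    id                  = 001, first
    title_t             = 245ab
    author_t            = 100abcd:110ab:700abcd
    subject_topic_facet = 600a:610a:650a:651a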

The collections management system provided a Java API, and programmers here
(not me) used it to build a nightly Solr XML export that is loaded into Solr.

For the two desktop database applications that power the photo archives and
the film and video archives, a programmer here created PHP interfaces that
generate a Solr XML file nightly. The same goes for the custom MSSQL
application that sources the "Names Source" type records -- a PHP script
runs nightly to create the Solr XML.
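
Roughly speaking, every nightly feed boils down to the same pattern: pull rows
out of the source system, map them to Solr fields, and push an add/commit to
Solr. A minimal sketch of that pattern in Ruby (our real feeds are Java and
PHP, and the Solr URL and field names below are made up):

    # Nightly-feed pattern, sketched with the rsolr gem (which Blacklight uses).
    require 'rsolr'

    solr = RSolr.connect url: 'http://localhost:8983/solr/'   # assumed URL

    # Pretend these rows came out of the source database (CMS, MSSQL, etc.).
    rows = [
      { id: 'irn508676', title: 'Example collection', type: 'Object', public: true }
    ]

    # Map source rows to Solr documents; the field names are hypothetical.
    docs = rows.map do |r|
      { id: r[:id], title_t: r[:title], record_type_facet: r[:type], public_b: r[:public] }
    end

    solr.add docs    # sends the <add><doc>...</doc></add> update to Solr
    solr.commit      # make the new documents searchable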

Another interesting feature is that some records are for internal display
only. There's an intranet version on a server on our LAN and another on a
public web server. For each record, a Solr field indicates whether it's OK
for web exposure. I had to decide whether it made more sense to maintain two
Solr indexes or a single Solr instance, and decided on a single instance.
I was able to use some Blacklight configuration (config.default_solr_params
and config.default_document_solr_params) to ensure that internal-only records
are filtered out of the public version. In addition, certain fields and certain
facets are only displayed on the internal version. The switching is done in
catalog_controller.rb based on where the application is sitting (via
Rails.root). A few differences are also based on where the user is sitting
(e.g. a user on our LAN viewing the public web version will get streaming
media from our LAN and not from our streaming host).
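
As a sketch of how that public/internal switch can be wired (not our exact
code -- the "public" test and the Solr field name public_b below are made up),
the relevant bits of catalog_controller.rb look roughly like this:

    class CatalogController < ApplicationController
      include Blacklight::Catalog

      # Decide which flavor of the app this is, keyed off Rails.root as described.
      PUBLIC_SITE = Rails.root.to_s.include?('public')

      configure_blacklight do |config|
        config.default_solr_params          = { qt: 'search', rows: 10 }
        config.default_document_solr_params = { qt: 'document' }

        if PUBLIC_SITE
          # Hide internal-only records from result lists and detail pages alike.
          config.default_solr_params[:fq]          = ['public_b:true']
          config.default_document_solr_params[:fq] = ['public_b:true']
        else
          # Intranet-only fields and facets get added here.
          config.add_facet_field 'internal_notes_facet', label: 'Internal notes'
        end
      end
    end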


On Tue, Dec 11, 2012 at 11:21 AM, Jonathan Rochkind wrote:

> Just curious, did you use Hydra for this project, or just straight
> Blacklight without Hydra?
>
> Esp if not Hydra, what tools did you end up using for indexing your
> content into Solr? (Only SolrMarc, all your content was already avail in
> Marc?)
>
>
> On 12/11/2012 11:10 AM, Levy, Michael wrote:
>
>> I posted the message below on the Blacklight Development group, and I was
>> encouraged to share with code4lib, so I'm reposting with some minor edits:
>>
>> I'd like to share a Blacklight implementation at the United States
>> Holocaust Memorial Museum that is available at
>> http://collections.ushmm.org/search. It's been in use in-house for about a
>> year, with constant improvements and additions.
>>
>> First, a tremendous thanks and kudos to all of the people involved in the
>> Blacklight project. I'm so grateful to everyone who worked on the project
>> and to those who have helped me with Blacklight, Ruby on Rails, and
>> SolrMarc.
>>
>> The various collecting units at the Museum use very different fields,
>> labels, vocabularies, and spellings. I had a lot of fun mapping them and
>> thinking about what sorts of fields might work together for searching. The
>> catalog records sources include: a commercial ILS; a commercial
>> collections
>> management system; two completely custom desktop database applications; a
>> spreadsheet; and a custom MSSQL database application. In addition, we have
>> a system that manages digitized assets that supplies some data.
>>
>> Selecting a project based on Ruby on Rails came at a cost, including the
>> learning curve involved with RoR and, moreso, due to the process of having
>> RoR established with our IT infrastructure group. (Thanks go to our IT
>> group as well!)
>>
>> I looked at some other really fine open source projects as well as
>> commercial products. Blacklight seemed optimal for our case because it
>> easily deals with any kind of metadata sources and it was a mature system
>> with a vibrant user/developer community.
>>
>> I'll highlight a few interesting features.
>>
>> Our collections management system supports relationships between records
>> including parent/child type relationships, e.g. between collection and the
>> items that comprise it. Here is a collection that has one archival
>> (document) collection plus several objects:
>> http://collections.ushmm.org/search/catalog/irn508676
>> We also have another parent/child type of relationship, where a group at
>> the Museum catalogs victim or survivor lists. I could import those, and
>> because there's enough metadata to link to the archival collection they are
>> part of, I can link them together. For example, this archival collection
>> http://collections.ushmm.org/search/catalog/irn508286 is linked to a number
>> of names source catalog records at the bottom, and each of those is linked
>> to the archival record as its source. These are done by doing a separate
>> Solr search for each item to see whether it's got a parent or children to
>> display near the bottom of the record.
>>
>> Many years ago the Museum developed a geographic database. One area where
>> the various collecting units catalog disparately is in location naming. I
>> s

Re: [CODE4LIB] Blacklight implementation at United States Holocaust Memorial Museum

2012-12-11 Thread Rosalyn Metz
Kudos to the USHMM team!


On Tue, Dec 11, 2012 at 11:21 AM, Jonathan Rochkind wrote:

> Just curious, did you use Hydra for this project, or just straight
> Blacklight without Hydra?
>
> Esp if not Hydra, what tools did you end up using for indexing your
> content into Solr? (Only SolrMarc, all your content was already avail in
> Marc?)
>
>
> On 12/11/2012 11:10 AM, Levy, Michael wrote:
>
>> I posted the message below on the Blacklight Development group, and I was
>> encouraged to share with code4lib, so I'm reposting with some minor edits:
>>
>> I'd like to share a Blacklight implementation at the United States
>> Holocaust Memorial Museum that is available at
>> http://collections.ushmm.org/search. It's been in use in-house for about a
>> year, with constant improvements and additions.
>>
>> First, a tremendous thanks and kudos to all of the people involved in the
>> Blacklight project. I'm so grateful to everyone who worked on the project
>> and to those who have helped me with Blacklight, Ruby on Rails, and
>> SolrMarc.
>>
>> The various collecting units at the Museum use very different fields,
>> labels, vocabularies, and spellings. I had a lot of fun mapping them and
>> thinking about what sorts of fields might work together for searching. The
>> catalog records sources include: a commercial ILS; a commercial
>> collections
>> management system; two completely custom desktop database applications; a
>> spreadsheet; and a custom MSSQL database application. In addition, we have
>> a system that manages digitized assets that supplies some data.
>>
>> Selecting a project based on Ruby on Rails came at a cost, including the
>> learning curve involved with RoR and, moreso, due to the process of having
>> RoR established with our IT infrastructure group. (Thanks go to our IT
>> group as well!)
>>
>> I looked at some other really fine open source projects as well as
>> commercial products. Blacklight seemed optimal for our case because it
>> easily deals with any kind of metadata sources and it was a mature system
>> with a vibrant user/developer community.
>>
>> I'll highlight a few interesting features.
>>
>> Our collections management system supports relationships between records
>> including parent/child type relationships, e.g. between collection and the
>> items that comprise it. Here is a collection that has one archival
>> (document) collection plus several objects:
>> http://collections.ushmm.org/search/catalog/irn508676
>> We also have another parent/child type of relationship, where a group at
>> the Museum catalogs victim or survivor lists. I could import those, and
>> because there's enough metadata to link to the archival collection they are
>> part of, I can link them together. For example, this archival collection
>> http://collections.ushmm.org/search/catalog/irn508286 is linked to a number
>> of names source catalog records at the bottom, and each of those is linked
>> to the archival record as its source. These are done by doing a separate
>> Solr search for each item to see whether it's got a parent or children to
>> display near the bottom of the record.
>>
>> Many years ago the Museum developed a geographic database. One area where
>> the various collecting units catalog disparately is in location naming. I
>> simply turned the names into a Solr synonyms file and then I highlight the
>> snippets in the index/list view. So that way, if you searched for L'viv
>> and
>> you got a hit on Lemberg or Lwow or L'vov, you'd know why you got it. Same
>> with Munich, München, Muenchen, Munchen, and for Lodz/Litzmannstadt. (Some
>> day would be nice to have the name expansion be switchable on or off.)
>>
>> Thumbnail (and larger) images from the archival records and objects come
>> from the collections management system for the Museum objects. Also
>> finding
>> aids for archival ("Document") records are currently managed in the CMS
>> system as doc, docx, or xls files and are delivered through Blacklight on
>> the detail page. For the photos and the historical film, the thumbnails
>> come from other sources based on the two custom desktop databases
>> mentioned
>> above.
>>
>> We have thousands of hours of oral history testimony in many languages
>> viewable from the Blacklight detail page as mp4 or mp3 files. The easiest
>> way to get to those is by limiting Record Type to Oral History, and Online
>> to "Yes":
>> http://collections.ushmm.org/search/catalog?f[di_available][]=Yes&f[record_type_facet][]=Oral+History
>>
>> I welcome feedback regarding the user interface, bug reports, and any
>> other
>> ideas you have, on the list or offline. (Plus I hope to meet some of you
>> at
>> code4lib 2013.)
>>
>> Cheers!
>>
>>
>>


Re: [CODE4LIB] Blacklight implementation at United States Holocaust Memorial Museum

2012-12-11 Thread Jonathan Rochkind
Just curious, did you use Hydra for this project, or just straight 
Blacklight without Hydra?


Esp if not Hydra, what tools did you end up using for indexing your 
content into Solr? (Only SolrMarc, all your content was already avail in 
Marc?)


On 12/11/2012 11:10 AM, Levy, Michael wrote:

I posted the message below on the Blacklight Development group, and I was
encouraged to share with code4lib, so I'm reposting with some minor edits:

I'd like to share a Blacklight implementation at the United States
Holocaust Memorial Museum that is available at
http://collections.ushmm.org/search It's been in use in-house for about a
year, with constant improvements and additions.

First, a tremendous thanks and kudos to all of the people involved in the
Blacklight project. I'm so grateful to everyone who worked on the project
and to those who have helped me with Blacklight, Ruby on Rails, and
SolrMarc.

The various collecting units at the Museum use very different fields,
labels, vocabularies, and spellings. I had a lot of fun mapping them and
thinking about what sorts of fields might work together for searching. The
catalog records sources include: a commercial ILS; a commercial collections
management system; two completely custom desktop database applications; a
spreadsheet; and a custom MSSQL database application. In addition, we have
a system that manages digitized assets that supplies some data.

Selecting a project based on Ruby on Rails came at a cost, including the
learning curve involved with RoR and, moreso, due to the process of having
RoR established with our IT infrastructure group. (Thanks go to our IT
group as well!)

I looked at some other really fine open source projects as well as
commercial products. Blacklight seemed optimal for our case because it
easily deals with any kind of metadata sources and it was a mature system
with a vibrant user/developer community.

I'll highlight a few interesting features.

Our collections management system supports relationships between records
including parent/child type relationships, e.g. between collection and the
items that comprise it. Here is a collection that has one archival
(document) collection plus several objects:
http://collections.ushmm.org/search/catalog/irn508676
We also have another parent/child type of relationship, where a group at
the Museum catalogs victim or survivor lists. I could import those, and
because there's enough metadata to link to the archival collection they are
part of, I can link them together. For example, this archival collection
http://collections.ushmm.org/search/catalog/irn508286 is linked to a number
of names source catalog records at the bottom, and each of those is linked
to the archival record as its source. These are done by doing a separate
Solr search for each item to see whether it's got a parent or children to
display near the bottom of the record.

Many years ago the Museum developed a geographic database. One area where
the various collecting units catalog disparately is in location naming. I
simply turned the names into a Solr synonyms file and then I highlight the
snippets in the index/list view. So that way, if you searched for L'viv and
you got a hit on Lemberg or Lwow or L'vov, you'd know why you got it. Same
with Munich, München, Muenchen, Munchen, and for Lodz/Litzmannstadt. (Some
day would be nice to have the name expansion be switchable on or off.)

Thumbnail (and larger) images from the archival records and objects come
from the collections management system for the Museum objects. Also finding
aids for archival ("Document") records are currently managed in the CMS
system as doc, docx, or xls files and are delivered through Blacklight on
the detail page. For the photos and the historical film, the thumbnails
come from other sources based on the two custom desktop databases mentioned
above.

We have thousands of hours of oral history testimony in many languages
viewable from the Blacklight detail page as mp4 or mp3 files. The easiest
way to get to those is by limiting Record Type to Oral History, and Online
to "Yes":
http://collections.ushmm.org/search/catalog?f[di_available][]=Yes&f[record_type_facet][]=Oral+History

I welcome feedback regarding the user interface, bug reports, and any other
ideas you have, on the list or offline. (Plus I hope to meet some of you at
code4lib 2013.)

Cheers!




[CODE4LIB] Blacklight implementation at United States Holocaust Memorial Museum

2012-12-11 Thread Levy, Michael
I posted the message below on the Blacklight Development group, and I was
encouraged to share with code4lib, so I'm reposting with some minor edits:

I'd like to share a Blacklight implementation at the United States
Holocaust Memorial Museum that is available at
http://collections.ushmm.org/search. It's been in use in-house for about a
year, with constant improvements and additions.

First, a tremendous thanks and kudos to all of the people involved in the
Blacklight project. I'm so grateful to everyone who worked on the project
and to those who have helped me with Blacklight, Ruby on Rails, and
SolrMarc.

The various collecting units at the Museum use very different fields,
labels, vocabularies, and spellings. I had a lot of fun mapping them and
thinking about what sorts of fields might work together for searching. The
catalog record sources include: a commercial ILS; a commercial collections
management system; two completely custom desktop database applications; a
spreadsheet; and a custom MSSQL database application. In addition, we have
a system that manages digitized assets that supplies some data.

Selecting a project based on Ruby on Rails came at a cost, including the
learning curve involved with RoR and, even more so, the process of getting
RoR established with our IT infrastructure group. (Thanks go to our IT
group as well!)

I looked at some other really fine open source projects as well as
commercial products. Blacklight seemed optimal for our case because it
easily deals with any kind of metadata source and it was a mature system
with a vibrant user/developer community.

I'll highlight a few interesting features.

Our collections management system supports relationships between records
including parent/child type relationships, e.g. between collection and the
items that comprise it. Here is a collection that has one archival
(document) collection plus several objects:
http://collections.ushmm.org/search/catalog/irn508676
We also have another parent/child type of relationship, where a group at
the Museum catalogs victim or survivor lists. I could import those, and
because there's enough metadata to link them to the archival collection they are
part of, I can link them together. For example, this archival collection
http://collections.ushmm.org/search/catalog/irn508286 is linked to a number
of names source catalog records at the bottom, and each of those is linked
to the archival record as its source. These links are made by running a
separate Solr search for each item to see whether it has a parent or children
to display near the bottom of the record.
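
In outline (not our exact code; the parent_id_s field name is made up), the
per-record lookups are just two small extra Solr queries:

    # Each detail page runs a "children of me?" and a "who is my parent?" query.
    require 'rsolr'

    solr = RSolr.connect url: 'http://localhost:8983/solr/'   # assumed URL

    def children_of(solr, id)
      # Records whose (hypothetical) parent_id_s field points at this record.
      resp = solr.get 'select', params: { q: "parent_id_s:#{id}", rows: 100 }
      resp['response']['docs']
    end

    def parent_of(solr, doc)
      return nil unless doc['parent_id_s']
      resp = solr.get 'select', params: { q: "id:#{doc['parent_id_s']}", rows: 1 }
      resp['response']['docs'].first
    end

    children_of(solr, 'irn508286').each { |child| puts child['id'] }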

Many years ago the Museum developed a geographic database. One area where
the various collecting units catalog disparately is in location naming. I
simply turned the names into a Solr synonyms file and then I highlight the
snippets in the index/list view. That way, if you searched for L'viv and
you got a hit on Lemberg or Lwow or L'vov, you'd know why you got it. Same
with Munich, München, Muenchen, Munchen, and with Lodz/Litzmannstadt. (Some
day it would be nice to make the name expansion switchable on or off.)
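
The synonyms file itself is just Solr's standard comma-separated format,
applied by a SynonymFilterFactory on the relevant field type; a tiny sketch
(not our actual file, which is generated from the geographic database):

    # synonyms.txt (sketch): terms on one line are treated as equivalent
    L'viv, L'vov, Lwow, Lemberg
    Munich, München, Muenchen, Munchen
    Lodz, Litzmannstadt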

Thumbnail (and larger) images for the archival records and Museum objects
come from the collections management system. Also, finding aids for archival
("Document") records are currently managed in the CMS as doc, docx, or xls
files and are delivered through Blacklight on
the detail page. For the photos and the historical film, the thumbnails
come from other sources based on the two custom desktop databases mentioned
above.

We have thousands of hours of oral history testimony in many languages
viewable from the Blacklight detail page as mp4 or mp3 files. The easiest
way to get to those is by limiting Record Type to Oral History, and Online
to "Yes":
http://collections.ushmm.org/search/catalog?f[di_available][]=Yes&f[record_type_facet][]=Oral+History

I welcome feedback regarding the user interface, bug reports, and any other
ideas you have, on the list or offline. (Plus I hope to meet some of you at
code4lib 2013.)

Cheers!