I should clarify that I made the Profile db file searchable: https://corpora.tika.apache.org/base/metadata/tika-eval/tika-eval-1.24.1.mv.db.gz.
I didn't load the mimes CSVs, but I can certainly do that as well.

On Wed, Jun 17, 2020 at 11:33 AM Tim Allison <talli...@apache.org> wrote:

> All,
>
> I have Datasette working locally. I converted H2 to SQLite trivially.
>
> Datasette is pretty slick, especially if we document example SQL calls.
> It works quite easily from Docker and only allows "SELECT" calls...I tried
> to drop/insert/update/modify with (fortunately) no luck.
>
> Are there any objections to opening a port and launching this on our
> server? If no objections, any preference for port?
>
> Cheers,
>
> Tim
>
> On Wed, Jun 17, 2020 at 9:04 AM Tim Allison <talli...@apache.org> wrote:
>
>> Downloading the entire db and then running it locally with unfamiliar
>> code isn’t easy enough?!
>>
>> But seriously, will look into Datasette. Thank you!
>>
>> Happy to set up Postgres as well.
>>
>> On Wed, Jun 17, 2020 at 8:19 AM Nick Burch <n...@apache.org> wrote:
>>
>>> Hi All
>>>
>>> As I understand it (which might be wrong!), Tim is generating a bunch of
>>> reports on things in the corpora / how different tools analyse the corpora /
>>> how Tika works on the stuff there, mostly as SQL databases
>>>
>>> Those databases are then available to anyone who is interested to download
>>> and analyse locally from eg
>>> https://corpora.tika.apache.org/base/metadata/mimes/
>>> (though that URL isn't working right now, hopefully fixed soon)
>>>
>>> There's a fairly new project called Datasette, which is a really nice
>>> publishing and exploring interface on top of SQL databases, especially
>>> aimed at archivists, journalists etc -
>>> https://github.com/simonw/datasette
>>>
>>> I wonder (though I won't have time for a few weeks to try myself...) if
>>> it'd be worth stuffing one or two of the SQL reports into a copy of
>>> datasette hosted on the vm, to let people more easily explore the data?
>>>
>>> Cheers
>>> Nick
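
For anyone who wants to see the "only SELECT works" behaviour from the thread without setting anything up, here is a minimal sketch using only Python's standard sqlite3 module. It is not Datasette's code and it does not touch the tika-eval database: the demo_mimes table and its rows are made up for illustration, and opening the file with mode=ro is my assumption about one way to get comparable read-only behaviour from a SQLite file.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def build_demo_db(path):
    """Create a tiny stand-in database (hypothetical schema, not tika-eval's)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE demo_mimes (mime TEXT, file_count INTEGER)")
    conn.executemany(
        "INSERT INTO demo_mimes VALUES (?, ?)",
        [("application/pdf", 3), ("text/html", 5)],
    )
    conn.commit()
    conn.close()


def open_read_only(path):
    """Open an existing SQLite file read-only via a URI filename."""
    return sqlite3.connect(Path(path).as_uri() + "?mode=ro", uri=True)


def try_statement(conn, sql):
    """Run one statement; return its rows, or a message if SQLite refuses it."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return f"refused: {exc}"


# Build the throwaway database in a fresh temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
build_demo_db(db_path)

# Reads succeed on the read-only connection; writes and DDL are refused.
ro = open_read_only(db_path)
select_result = try_statement(
    ro, "SELECT mime, file_count FROM demo_mimes ORDER BY mime"
)
insert_result = try_statement(ro, "INSERT INTO demo_mimes VALUES ('image/png', 1)")
drop_result = try_statement(ro, "DROP TABLE demo_mimes")
ro.close()

print(select_result)
print(insert_result)
print(drop_result)
```

The SELECT returns the two demo rows, while the INSERT and DROP come back as "refused: ..." strings, because SQLite itself rejects writes on a connection opened this way.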