I should clarify that I made the Profile db file searchable:
https://corpora.tika.apache.org/base/metadata/tika-eval/tika-eval-1.24.1.mv.db.gz.


I didn't load the mimes CSVs, but I can certainly do that as well.
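
For anyone curious about the "SELECT only" behaviour mentioned in the quoted message below, here is a minimal sketch of the general idea using Python's standard sqlite3 module. It is an illustration under my own assumptions, not a description of how Datasette is implemented: the table name `profiles`, its columns, and the helper names are all hypothetical and are not taken from the tika-eval database. The sketch opens a SQLite file in read-only mode via a `file:` URI, so reads succeed and writes are refused.

```python
import sqlite3
from pathlib import Path


def make_demo_db(path):
    """Create a tiny demo database (hypothetical schema, not the tika-eval one)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE profiles (file_name TEXT, mime TEXT)")
    conn.executemany(
        "INSERT INTO profiles VALUES (?, ?)",
        [("a.pdf", "application/pdf"), ("b.doc", "application/msword")],
    )
    conn.commit()
    conn.close()


def open_read_only(path):
    """Open an existing SQLite file with mode=ro so writes are rejected."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def count_rows(conn):
    """A plain SELECT, which works on a read-only connection."""
    return conn.execute("SELECT COUNT(*) FROM profiles").fetchone()[0]


def try_write(conn):
    """Attempt an INSERT; return the error message, or None if it succeeded."""
    try:
        conn.execute("INSERT INTO profiles VALUES ('c.txt', 'text/plain')")
    except sqlite3.OperationalError as exc:
        return str(exc)
    return None
```

Calling `try_write` on a connection returned by `open_read_only` should hand back an error message rather than modify the file, which matches what I saw when I tried to drop/insert/update.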

On Wed, Jun 17, 2020 at 11:33 AM Tim Allison <talli...@apache.org> wrote:

> All,
>
>   I have Datasette working locally. I converted the H2 database to SQLite trivially.
>
>   Datasette is pretty slick, especially if we document example SQL queries.
> It works quite easily from Docker and only allows "SELECT" calls...I tried
> to drop/insert/update/modify with (fortunately) no luck.
>
>   Are there any objections to opening a port and launching this on our
> server?  If no objections, any preference for port?
>
>      Cheers,
>
>                Tim
>
>
>
> On Wed, Jun 17, 2020 at 9:04 AM Tim Allison <talli...@apache.org> wrote:
>
>> Downloading the entire db and then running it locally with unfamiliar
>> code isn’t easy enough?!
>>
>> But seriously, will look into Datasette. Thank you!
>>
>> Happy to set up Postgres as well.
>>
>> On Wed, Jun 17, 2020 at 8:19 AM Nick Burch <n...@apache.org> wrote:
>>
>>> Hi All
>>>
>>> As I understand it (which might be wrong!), Tim is generating a bunch of
>>> reports on things in the corpora / how different tools analyse the corpora /
>>> how Tika works on the stuff there, mostly as SQL databases.
>>>
>>> Those databases are then available to anyone who is interested to download
>>> and analyse locally from e.g.
>>> https://corpora.tika.apache.org/base/metadata/mimes/
>>> (though that URL isn't working right now, hopefully fixed soon)
>>>
>>> There's a fairly new project called Datasette, which is a really nice
>>> publishing and exploring interface on top of SQL databases, especially
>>> aimed at archivists, journalists etc -
>>> https://github.com/simonw/datasette
>>>
>>> I wonder (though I won't have time for a few weeks to try myself...) if
>>> it'd be worth stuffing one or two of the SQL reports into a copy of
>>> datasette hosted on the vm, to let people more easily explore the data?
>>>
>>> Cheers
>>> Nick
>>>
>>
