Hmm. Is there a way to send other encodings to the server via the remote
API?

I'm on my way to Japan for a workshop where we'll be using my system.
Japanese-language documents are stored more efficiently in UTF-16, so I
expect that users will either already have documents in that encoding or
will create new ones. Of course, for the workshop we can limit ourselves
to UTF-8, but I'm trying to make the system as foolproof as possible.
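
To make the storage claim concrete, here is a minimal sketch that compares byte counts in Ruby. The sample strings are made up for illustration. Kana and common kanji take three bytes each in UTF-8 and two in UTF-16, while ASCII markup goes the other way, so the real saving depends on how much of a document is markup.

```ruby
# Illustrative strings only; not taken from any real document.
text   = "日本語のテキスト"   # eight Japanese characters
markup = "<test></test>"      # thirteen ASCII characters

text_utf8    = text.encode(Encoding::UTF_8).bytesize
text_utf16   = text.encode(Encoding::UTF_16LE).bytesize
markup_utf8  = markup.encode(Encoding::UTF_8).bytesize
markup_utf16 = markup.encode(Encoding::UTF_16LE).bytesize

puts "text:   UTF-8 #{text_utf8} bytes, UTF-16 #{text_utf16} bytes"
puts "markup: UTF-8 #{markup_utf8} bytes, UTF-16 #{markup_utf16} bytes"
```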

I think the issue with my script was that I was putting quotes around the
XML strings, which causes the server to treat them as file paths rather
than as XML to load. Once I fixed that, I was able to delete and add
files from my Ruby git hooks.

I'll have to get a better understanding of how Ruby handles arbitrary byte
sequences (this is where there's a little too much magic for my taste), but
I would expect that if I give the remote API a byte sequence that starts
with 0xFFFE, 0xFEFF, 0x003C003F, or 0x3C003F00, it would treat it as
UTF-16.

Cheers,

E.
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com




On 2/18/16, 4:58 PM, "Christian Grün" <christian.gr...@gmail.com> wrote:

>Hi Eliot,
>
>For most client bindings, files must indeed be sent in UTF-8, so I
>guess it’s also the case for the Ruby binding. If the sent bytes are
>correct UTF-8, everything should work fine.
>
>Christian
>
>
>On Thu, Feb 18, 2016 at 6:08 PM, Eliot Kimber <ekim...@contrext.com>
>wrote:
>> This test document has a non-ASCII character '〺' (\u303A), which I added
>> to test handling of multi-byte characters.
>>
>> Ruby and the BaseX client seem to be handling the UTF-8 correctly, but
>> the UTF-16 didn't work. I'm guessing it's Ruby's fault because it's
>> treating the bytes as a string, and of course that's not going to work
>> in a naive way.
>>
>> Cheers,
>>
>> E.
>> ----
>> Eliot Kimber, Owner
>> Contrext, LLC
>> http://contrext.com
>>
>>
>>
>>
>> On 2/18/16, 11:04 AM, "Eliot Kimber"
>> <basex-talk-boun...@mailman.uni-konstanz.de on behalf of
>> ekim...@contrext.com> wrote:
>>
>>>I turned my UTF-8 file into a UTF-16 file and tried to commit it to
>>>BaseX via the Ruby client, but it did not work:
>>>
>>>BaseXClient.rb:50:in `execute': Resource "/opt/basex/?" not found.
>>>(RuntimeError)
>>>
>>>Where "?" is some kind of "unrecognized character" indicator
>>>
>>>Cheers,
>>>
>>>E.
>>>
>>>
>>>----
>>>Eliot Kimber, Owner
>>>Contrext, LLC
>>>http://contrext.com
>>>
>>>
>>>
>>>
>>>On 2/18/16, 10:26 AM, "Eliot Kimber"
>>><basex-talk-boun...@mailman.uni-konstanz.de on behalf of
>>>ekim...@contrext.com> wrote:
>>>
>>>>I'm implementing server-side git hooks for use in GitLab under Docker
>>>>where Java is not available (at least that I can see). The hooks load
>>>>or delete files from databases in BaseX.
>>>>
>>>>I'm trying to implement the hooks in Ruby (which is much more pleasant
>>>>than bash scripting in any case) and I'm using the BaseXClient.rb from
>>>>https://github.com/BaseXdb/basex/tree/master/basex-api/src/main/ruby
>>>>
>>>>I need to create or replace files by sending the bytes. I'd rather not
>>>>read the input file into a Ruby string and send that, since I don't
>>>>trust Ruby not to hose up the data (even when it's UTF-8 I still don't
>>>>trust it, but I only started using Ruby yesterday, so maybe my mistrust
>>>>is misplaced?).
>>>>
>>>>Using the AddExample.rb as guide, I'm doing this:
>>>>
>>>>(Earlier code to open or create database, which works).
>>>>
>>>>file = File.new("../../" + path, "rb")
>>>>bytes = file.read
>>>>file.close
>>>>puts "file=/#{bytes}/"
>>>>@basex.add(path, "#{bytes}")
>>>>
>>>>I also tried:
>>>>
>>>>@basex.add(path, bytes)
>>>>
>>>>
>>>>
>>>>And I get this result (I added some debugging messages to sendCmd()):
>>>>
>>>>ensureDatabase(): Checking database "_dfst^metadata^temp^master"...
>>>>BaseXResult: Database '_dfst^metadata^temp^master' was opened in 1.53 ms.
>>>>Added or modified file: "test-newname.xml"
>>>>file=/<test>This is a test 20</test>
>>>>/
>>>>
>>>>*** sendCmd():
>>>>cmd=
>>>>arg=test-newname.xml
>>>>input=<test>This is a test 20</test>
>>>>BaseXClient.rb:110:in `sendCmd': "test-newname.xml.xml" (Line 1):
>>>>Premature end of file. (RuntimeError)
>>>>
>>>>      from commit-hooks/git/server-side/BaseXClient.rb:64:in `add'
>>>>      from commit-hooks/git/server-side/post-receive:80:in `block in update'
>>>>      from commit-hooks/git/server-side/post-receive:74:in `each'
>>>>      from commit-hooks/git/server-side/post-receive:74:in `update'
>>>>      from commit-hooks/git/server-side/post-receive:111:in `block in <main>'
>>>>      from commit-hooks/git/server-side/post-receive:103:in `each'
>>>>      from commit-hooks/git/server-side/post-receive:103:in `<main>'
>>>>Eliots-MBP:hooks ekimber$
>>>>
>>>>A couple of things here:
>>>>
>>>>
>>>>Where is the extra ".xml" in the target filename coming from?
>>>>
>>>>What is causing the premature end of file? It feels like it's trying
>>>>to interpret the second argument as a filename rather than the data to
>>>>be loaded.
>>>>
>>>>If I use basex.execute("add to #{path} #{bytes}") it works, but of
>>>>course I get duplicate files if I run the command twice.
>>>>
>>>>If I try:
>>>>
>>>>@basex.execute("replace #{path} #{bytes}")
>>>>
>>>>Then I get the same failure.
>>>>
>>>>
>>>>So something is not right.
>>>>
>>>>My Docker container is running 8.4.1 beta.
>>>>
>>>>What am I missing?
>>>>
>>>>Thanks,
>>>>
>>>>Eliot
>>>>----
>>>>Eliot Kimber, Owner
>>>>Contrext, LLC
>>>>http://contrext.com
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>

