Re: [basex-talk] Binary module

2019-08-14 Thread Christian Grün
If you want to count the number of required bits, you could do something
like the following:

  declare function local:bits($n as xs:integer) as xs:integer {
    (: one bit per halving step until the value reaches 0 :)
    if ($n = 0) then 0 else 1 + local:bits($n idiv 2)
  };
  local:bits(65535)  (: yields 16 :)

There are probably various ways to solve the problem; suggestions
are welcome.
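
As one more variant, and only as a rough sketch: the integer can be split
into octets before it is passed to convert:integers-to-base64, so that
bin:length reports how many octets the value needs. local:octets is a
hypothetical helper name (not part of BaseX), and the sketch assumes a
non-negative input.

```xquery
(: local:octets is a hypothetical helper, not a BaseX function.
   It splits a non-negative integer into octets, most significant first. :)
declare function local:octets($n as xs:integer) as xs:integer* {
  if ($n < 256) then $n
  else (local:octets($n idiv 256), $n mod 256)
};

let $octets := local:octets(34777)
return (
  $octets,
  (: every supplied integer now fits into one octet :)
  bin:length(convert:integers-to-base64($octets))
)
```

For 34777 (hexadecimal 87D9), the octets should be 135 and 217, so the
reported length should be 2.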






Re: [basex-talk] Binary module

2019-08-14 Thread Christian Grün
Hi Giuseppe,

> Thanks! I missed that specification. Is there any reason why only the first
> octet is provided? More generally, I was interested in testing how many
> bits/octets are used to represent an integer.

The function was mostly introduced to convert byte sequences to a
binary representation and vice versa. The xs:integer data type was chosen
because it’s the predominant numeric type in XQuery (otherwise, the input
would have needed to be cast to xs:byte).






Re: [basex-talk] Binary module

2019-08-14 Thread Giuseppe G. A. Celano
Hi Christian,

Thanks! I missed that specification. Is there any reason why only the first
octet is provided? More generally, I was interested in testing how many
bits/octets are used to represent an integer.


> On Aug 14, 2019, at 10:58 AM, Christian Grün  
> wrote:
> 
> bin:length(convert:integers-to-base64(1 to 1000))



Re: [basex-talk] Binary module

2019-08-14 Thread Christian Grün
Hi Giuseppe,

If you call convert:integers-to-base64, only the first 8 bits of the
supplied integers will be considered [1]. As a result, the binary
length of a single integer as argument will always be 1. Things are
different if you supply a sequence of integers to the function:

(: yields 1000, one octet per supplied integer :)
bin:length(convert:integers-to-base64(1 to 1000))
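
To see the truncation for single values, the following sketch round-trips a
few integers through the binary representation. It assumes that "the first
8 bits" means the lowest 8 bits, i.e. the value modulo 256, which matches
the results for 0 and 256 reported in this thread.

```xquery
(: Round-trip single integers and compare the result with n mod 256.
   The modulo comparison is an assumption about how truncation works. :)
for $n in (0, 255, 256, 34777)
let $octet := convert:binary-to-integers(convert:integers-to-base64($n))
return $n || " -> " || $octet || " (n mod 256 = " || $n mod 256 || ")"
```

Under that assumption, 34777 should come back as 217.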

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/Conversion_Module#convert:integers-to-base64






Re: [basex-talk] Binary module

2019-08-14 Thread Giuseppe G. A. Celano
Thank you! This is very helpful. As to the bytes of an integer, I would assume
that, since I always get one byte, its size is not 8 bits. In this case, I
should also get values bigger than 255, but I cannot actually get that, no
matter what the value of the integer is (e.g.,
convert:integers-to-base64(34777) => convert:binary-to-integers()). I notice
that the result for int 0 is 0 and for int 256 is also 0: it seems it always
outputs octet values, even if we need more than 1 octet.





Re: [basex-talk] Binary module

2019-08-14 Thread Michael Seiferle
Hi Giuseppe,

1) You are right, it’s not too obvious what’s happening here:
In BaseX, the default serialization method is "basex", which tries to return
items in a more readable way. Internally, your string is represented as
xs:base64Binary, but when it is output to the query results panel, it is
serialized according to whichever serialization method is active. We opted for
our custom serialization because we considered it a sane default: it is able
to serialize all kinds of items (e.g., XML serialization won’t serialize maps
or arrays) and is generally a little more readable than adaptive, because it
omits type information such as xs:base64Binary("c8Ogw6A=").

The following query serializes your value with different serialization
parameters:

  let $it := convert:string-to-base64("sàà")

  for $method in (
    "basex", "xml", "adaptive", "json", "text", "html"
  )
  return element { $method } {
    serialize(
      $it,
      map { "method": $method }
    )
  }



2) The length function basically tells you how many bytes your string needs
when encoded as UTF-8. You may use the following query to try it yourself:

  let $str := "•"
  let $bin := convert:string-to-base64($str)
  return element _ {
    element str { $str },
    element b64 { $bin },
    element octets {
      attribute length { bin:length($bin) },
      for $byte in $bin => convert:binary-to-integers()
      return element octet {
        $byte => convert:integer-to-base(2)
      }
    }
  }

Depending on the input string and its encoding (e.g., 'a' needs only one
byte, but 'ä' already needs two, and '•' even needs three in UTF-8), your
string is converted to xs:base64Binary, and the bin:length() function counts
how many octets are needed for this representation.
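
As a shorter check of the byte counts mentioned above, here is a sketch that
prints the octet length for each of the three example strings. It assumes
that each string is a single precomposed character.

```xquery
(: Print the number of UTF-8 octets needed for each example string. :)
for $str in ("a", "ä", "•")
return $str || ": " || bin:length(convert:string-to-base64($str))
```

Per the counts given above, this should print a: 1, ä: 2 and •: 3.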

Hope this helps :-)

Best
Michael


> On 14.08.2019 at 02:22, Giuseppe G. A. Celano wrote:
> 
> Hi
> 
> I am playing around with the binary module. I have two simple questions:
> 
> 1) convert:string-to-base64("sàà") returns sàà: what does that mean? I see in
> the documentation that I need to use string() to see the value of
> xs:base64Binary (c8Ogw6A=), but shouldn't xs:base64Binary already be
> output as c8Ogw6A=? (This is actually displayed in the Info Panel.)
>
> 2) bin:length(convert:integers-to-base64(x)) always returns 1, no
> matter how big the number is. In the documentation I read that the output of
> bin:length should be the size of binary data in octets: how is that possible?
> 
> Thanks,
> Giuseppe
> 
>