Re: [basex-talk] Binary module
If you want to count the number of required bits, you could do something like the following:

  declare function local:bits($n as xs:integer) as xs:integer {
    (: one bit per halving step until nothing is left :)
    if ($n = 0) then 0 else 1 + local:bits($n idiv 2)
  };
  local:bits(65535)  (: yields 16 :)

There are probably various ways to solve the problem; suggestions are welcome.

On Wed, Aug 14, 2019 at 1:07 PM Christian Grün wrote:
>
> Hi Giuseppe,
>
> > Thanks! I missed that specification. Is there any reason why only the first
> > octet is provided? More generally, I was interested in testing how many
> > bits/octets are used to represent an integer.
>
> The function was mostly introduced to convert byte sequences to a
> binary representation and vice versa. The xs:integer data type was chosen
> because it’s the predominant numeric type in XQuery (otherwise, the input
> would have needed to be cast to xs:byte).
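As one further suggestion, the bit count can also be read off the base-2 string representation of the number. This is only a sketch: it assumes BaseX with its Conversion Module, it is meant for non-negative integers only, and local:bit-length is a hypothetical name chosen for this example.

```xquery
(: Sketch: number of binary digits of a non-negative integer.
   local:bit-length is a hypothetical helper, not part of BaseX. :)
declare function local:bit-length($n as xs:integer) as xs:integer {
  (: convert:integer-to-base returns the digits as a string :)
  string-length(convert:integer-to-base($n, 2))
};
local:bit-length(65535)  (: yields 16 :)
```

Note that this variant reports 1 for the input 0, because the string "0" has one digit.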
Re: [basex-talk] Binary module
Hi Giuseppe,

> Thanks! I missed that specification. Is there any reason why only the first
> octet is provided? More generally, I was interested in testing how many
> bits/octets are used to represent an integer.

The function was mostly introduced to convert byte sequences to a binary representation and vice versa. The xs:integer data type was chosen because it’s the predominant numeric type in XQuery (otherwise, the input would have needed to be cast to xs:byte).
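If the goal is to see all octets of a larger integer rather than a single one, one option is to split the value into octets first and pass the resulting sequence to convert:integers-to-base64. The following is a minimal sketch under the assumption that the input is non-negative; local:to-octets is a hypothetical helper, not a BaseX function.

```xquery
(: Sketch: split a non-negative integer into its octets, most significant first.
   local:to-octets is a hypothetical helper. :)
declare function local:to-octets($n as xs:integer) as xs:integer* {
  if ($n < 256) then $n
  else (local:to-octets($n idiv 256), $n mod 256)
};

(: 34777 = 135 * 256 + 217, so two octets are passed on :)
let $octets := local:to-octets(34777)
return bin:length(convert:integers-to-base64($octets))  (: yields 2 :)
```

Since every integer in the sequence is now below 256, no information is lost when each one is reduced to a single octet.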
Re: [basex-talk] Binary module
Hi Christian,

Thanks! I missed that specification. Is there any reason why only the first octet is provided? More generally, I was interested in testing how many bits/octets are used to represent an integer.

> On Aug 14, 2019, at 10:58 AM, Christian Grün wrote:
>
> bin:length(convert:integers-to-base64(1 to 1000))
Re: [basex-talk] Binary module
Hi Giuseppe,

If you call convert:integers-to-base64, only the lowest 8 bits of the supplied integers will be considered [1]. As a result, the binary length of a single integer as argument will always be 1. Things are different if you supply a sequence of integers to the function:

  (: yields 1000 :)
  bin:length(convert:integers-to-base64(1 to 1000))

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/Conversion_Module#convert:integers-to-base64

On Wed, Aug 14, 2019 at 10:50 AM Giuseppe G. A. Celano wrote:
>
> Thank you! This is very helpful. As to the bytes of an integer, I would
> assume that, since I always get one byte, its size is not 8 bits. In that
> case, I should also get values bigger than 255, but I cannot actually get
> them, no matter what the value of the integer is (e.g.,
> convert:integers-to-base64(34777) => convert:binary-to-integers()). I notice
> that the result for int 0 is 0 and for int 256 is also 0: it seems it always
> outputs octet values, even if more than one octet is needed.
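The observation that 0 and 256 both come back as 0 fits the explanation that only 8 bits of each integer are considered. A small sketch to try this out, assuming BaseX with the Conversion Module; the element and attribute names are made up for this example. For non-negative input, the round-trip value should agree with the value modulo 256.

```xquery
(: Sketch: compare the round trip through base64 with "mod 256".
   Element and attribute names are arbitrary. :)
for $n in (0, 255, 256, 34777)
return element value {
  attribute input { $n },
  attribute round-trip {
    convert:integers-to-base64($n) => convert:binary-to-integers()
  },
  attribute mod-256 { $n mod 256 }
}
```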
Re: [basex-talk] Binary module
Thank you! This is very helpful. As to the bytes of an integer, I would assume that, since I always get one byte, its size is not 8 bits. In that case, I should also get values bigger than 255, but I cannot actually get them, no matter what the value of the integer is (e.g., convert:integers-to-base64(34777) => convert:binary-to-integers()). I notice that the result for int 0 is 0 and for int 256 is also 0: it seems it always outputs octet values, even if more than one octet is needed.

> On Aug 14, 2019, at 9:54 AM, Michael Seiferle wrote:
>
> Depending on the input string and its encoding (e.g. 'a' will only need one
> byte, but 'ä' already needs two, '•' will even need three in UTF-8) your
> string is converted to xs:base64Binary and the bin:length() function will
> count how many octets are needed for this representation.
Re: [basex-talk] Binary module
Hi Giuseppe,

1) You are right, it’s not too obvious what’s happening here:
In BaseX, the default serialization mode is "basex", which tries to return items in a more readable way. Internally, your string is represented as xs:base64Binary, but when it is output to the query results panel, it is serialized according to whichever serialization is active. We opted for our custom serialization as we considered it to be a sane default: it is able to serialize all kinds of items (e.g. XML serialization won’t serialize maps or arrays) and is generally a little more readable than adaptive because we omit type information such as xs:base64Binary("c8Ogw6A=").

The following query serializes your value with different serialization parameters:

  let $it := convert:string-to-base64("sàà")
  for $method in ("basex", "xml", "adaptive", "json", "text", "html")
  return element { $method } {
    serialize($it, map { "method": $method })
  }

2) The length function basically tells you how many bytes are needed to encode your string as UTF-8. You may use the following query and try it yourself:

  let $str := "•"
  let $bin := convert:string-to-base64($str)
  return element _ {
    element str { $str },
    element b64 { $bin },
    element octets {
      attribute length { bin:length($bin) },
      for $byte in $bin => convert:binary-to-integers()
      return element octet {
        $byte => convert:integer-to-base(2)
      }
    }
  }

Depending on the input string and its encoding (e.g. 'a' will only need one byte, but 'ä' already needs two, '•' will even need three in UTF-8) your string is converted to xs:base64Binary and the bin:length() function will count how many octets are needed for this representation.

Hope this helps :-)

Best
Michael

> On 14.08.2019 at 02:22, Giuseppe G. A. Celano wrote:
>
> Hi
>
> I am playing around with the binary module. I have two simple questions:
>
> 1) convert:string-to-base64("sàà") returns sàà: what does it mean? I see in
> the documentation that I need to use string() to see the value of
> xs:base64Binary (c8Ogw6A=), but shouldn't xs:base64Binary already be
> output as c8Ogw6A=? (This is actually displayed in the Info Panel.)
>
> 2) bin:length(convert:integers-to-base64(x)) always returns one number, no
> matter how big the number is. In the documentation I read that the output of
> bin:length should be the size of binary data in octets: how is that possible?
>
> Thanks,
> Giuseppe
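The byte counts mentioned for 'a', 'ä' and '•' can be checked side by side with a short query. This is only a sketch, assuming BaseX and the default UTF-8 encoding of convert:string-to-base64; the element and attribute names are arbitrary.

```xquery
(: Sketch: number of octets needed for single characters.
   Element and attribute names are arbitrary. :)
for $str in ("a", "ä", "•")
return element char {
  attribute octets { bin:length(convert:string-to-base64($str)) },
  $str
}
```

Going by the explanation in this thread, the octets attribute should read 1, 2 and 3 for the three characters.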