Hello,
The Subversion command-line client tries to detect binary files with the following algorithm:
- A file is NOT BINARY if it contains only the UTF-8 BOM signature (why not check whether the first N bytes are valid UTF-8?);
- A file is BINARY if the first 1024 bytes contain a ZERO byte (for uniformly distributed bytes, the chance that no ZERO byte appears is (1 - 1 / 256) ^ 1024 = ~1.8%);
- A file is BINARY if more than 85% of the first 1024 bytes fall outside the ranges 0x07-0x0D and 0x20-0x7F (153 of the 256 possible byte values count as "binary", ~60%).
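The three rules above can be sketched in Python roughly as follows. This is only an illustration of the rules as I described them, not Subversion's actual code; the function name looks_binary and the handling of an empty file are my own assumptions.

```python
def looks_binary(data: bytes) -> bool:
    """Sketch of the heuristic described above (hypothetical helper)."""
    # Rule 1: a file that is nothing but the UTF-8 BOM counts as text.
    if data == b"\xef\xbb\xbf":
        return False

    # Only the first 1024 bytes are inspected.
    block = data[:1024]
    if not block:
        # Assumption: an empty file is treated as text.
        return False

    # Rule 2: any ZERO byte marks the file as binary.
    if 0 in block:
        return True

    # Rule 3: count bytes outside 0x07-0x0D and 0x20-0x7F.
    suspicious = sum(
        1 for b in block
        if not (0x07 <= b <= 0x0D or 0x20 <= b <= 0x7F)
    )
    return suspicious / len(block) > 0.85
```

With this sketch, a short ASCII string is classified as text, while a string made up almost entirely of Cyrillic letters crosses the 85% threshold and is classified as binary.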
This algorithm looks broken.
For example:
- File "text.txt" is detected as binary.
  It contains a block of text about Subversion from Wikipedia in UTF-8 (https://ru.wikipedia.org/wiki/Subversion) and unfortunately has too many Cyrillic characters (each character is 2 "binary" bytes).
- File "binary.txt" is detected as text.
  It was created by "dd if=/dev/urandom of=binary.txt count=1 bs=2048" and unfortunately does not contain a ZERO byte in the first 1024 bytes.
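The arithmetic behind both cases can be checked with a small Python sketch. The helper names and the sample phrase are illustrative assumptions; the phrase is not the actual content of text.txt.

```python
def is_suspicious(b: int) -> bool:
    """True for byte values outside 0x07-0x0D and 0x20-0x7F."""
    return not (0x07 <= b <= 0x0D or 0x20 <= b <= 0x7F)


def suspicious_fraction(data: bytes) -> float:
    """Share of bytes in data that the heuristic would count as binary."""
    return sum(is_suspicious(b) for b in data) / len(data)


# Case 1: Russian text in UTF-8. Every Cyrillic letter is two bytes >= 0x80,
# so only spaces and punctuation fall inside the "text" ranges.
# The sample phrase is illustrative, not the actual content of text.txt.
sample = "Разработка была начата по инициативе".encode("utf-8")
text_fraction = suspicious_fraction(sample)

# Case 2: uniformly random bytes.
suspicious_values = sum(is_suspicious(b) for b in range(256))
random_fraction = suspicious_values / 256   # expected share of "binary" bytes
p_no_zero = (1 - 1 / 256) ** 1024           # chance of no ZERO byte in 1024 bytes

print(f"Cyrillic sample: {text_fraction:.1%} suspicious bytes (threshold 85%)")
print(f"Random data: {suspicious_values} of 256 values, {random_fraction:.1%} expected")
print(f"P(no ZERO byte in 1024 random bytes) = {p_no_zero:.2%}")
```

The Cyrillic sample lands well above the 85% threshold, while random data is expected to stay near 60%, so random data is called text whenever it happens to contain no ZERO byte in the inspected block.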
--
Best regards,
Navrotskiy Artem
РазÑабоÑка Subversion бÑла наÑаÑа в 2000 Ð³Ð¾Ð´Ñ Ð¿Ð¾ иниÑиаÑиве и пÑи ÑинанÑовой поддеÑжке CollabNet. ÐниÑиаÑоÑÑ Ð¿ÑоекÑа Ñ Ð¾Ñели ÑоздаÑÑ ÑвободнÑÑ ÑиÑÑÐµÐ¼Ñ ÑпÑÐ°Ð²Ð»ÐµÐ½Ð¸Ñ Ð²ÐµÑÑиÑми, в оÑновном Ð¿Ð¾Ñ Ð¾Ð¶ÑÑ Ð½Ð° CVS, но лиÑÑннÑÑ ÐµÑ Ð¾Ñибок и неÑдобÑÑв. Ð Ñо вÑÐµÐ¼Ñ Ð½Ðµ ÑÑÑеÑÑвовало лÑÑÑÐ¸Ñ Ð¿ÑогÑамм ÑÑого клаÑÑа Ñо Ñвободной лиÑензией, CVS бÑла ÑÑандаÑÑом де-ÑакÑо ÑÑеди ÑазÑабоÑÑиков Ñвободного пÑогÑаммного обеÑпеÑениÑ. ÐÑбÑав ÐµÑ Ð·Ð° оÑновÑ, ÑазÑабоÑÑики Subversion надеÑлиÑÑ ÑпÑоÑÑиÑÑ ÑазÑабоÑÐºÑ Ð·Ð° ÑÑÑÑ Ð¸ÑполÑÐ·Ð¾Ð²Ð°Ð½Ð¸Ñ Ñже пÑовеÑеннÑÑ ÐºÐ¾Ð½ÑепÑий и в Ñо же вÑÐµÐ¼Ñ Ð¾Ð±Ð»ÐµÐ³ÑиÑÑ Ð¿ÐµÑÐµÑ Ð¾Ð´ на новÑÑ ÑиÑÑÐµÐ¼Ñ Ð¼Ð½Ð¾Ð³Ð¾ÑиÑленнÑм полÑзоваÑелÑм CVS.[15]