Re: Swift: How to determine if a Character represents whitespace?

2015-04-06 Thread Roland King
I have no idea how a linguistic tagger determines whitespace and whether it 
uses the same definition for whitespace as NSCharacterSet does. Given that it's 
multi-language-aware I wouldn't be shocked to find it uses some entirely 
different way of enumerating textual elements. 

 On 6 Apr 2015, at 20:29, Gerriet M. Denkmann gerri...@icloud.com wrote:
 
 
 On 4 Apr 2015, at 16:13, cocoa-dev-requ...@lists.apple.com wrote:
 
 ok here’s my try, assuming NSLinguisticTagger knows what it’s doing. And yes 
 it’s a bit stupid to use a linguistic tagger to do something like that but 
 .. whatever 
 
 The linguistic tagger should use the same definition of whitespace as 
 NSCharacterSet.whitespaceCharacterSet.
 If so, this would work for all characters (even those whose Unicode code 
 point does NOT fit into an unsigned short):
 
 import Cocoa
 
 let whiteSet = NSCharacterSet.whitespaceCharacterSet()
 let testString =  ...
 
 var i : Int = 0
 for scalar in testString.unicodeScalars
 {
   let uChar : UTF32Char = scalar.value
   let isWhite = whiteSet.longCharacterIsMember(uChar)
   let note = isWhite ? " whiteSpace " : " non white "
   
   var stringWithScalar = ""
   stringWithScalar.append(scalar)
 
   let indexFormated = NSString(format: "%2d", i++)
 
   let codePoint = scalar.value  // UInt32
   let hexFormated = NSString(format: "%#07x", codePoint)
   
   println("codePoint[" + indexFormated + "] = " + hexFormated + note + stringWithScalar)
 }
 
 Gerriet.
 
 
 


___

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Re: Swift: How to determine if a Character represents whitespace?

2015-04-03 Thread Charles Jenkins
I imagine you’re right, that they’re NSString indexes packaged up into a 
frustrating return type. After sleeping on it, though, I imagined that even if 
complex grapheme clusters WERE to make count( attrStr.string ) return a 
different result than attrStr.length, it would probably never be due to 
whitespace. So if I go back to Charles Srstka’s original suggestion, where you 
pull off one character at a time, convert it to a 1-Character string, and then 
test for whitespace membership, I should be able to count leading and trailing 
whitespace characters and then do math based on attrStr.length to create the 
range.

Here’s my current playground: 

import Cocoa

extension Character {

  func isMemberOfSet( set:NSCharacterSet )
    -> Bool
  {
    // The for loop only executes once;
    // its purpose is to convert Character to a type
    // you can actually do something with
    for char in String( self ).utf16 {
      if set.characterIsMember( char ) {
        return true
      }
    }
    return false
  }

}

var result:NSRange

let whitespace = NSCharacterSet.whitespaceAndNewlineCharacterSet()

let attrStr = NSAttributedString( string:"    Fourscore and seven years ago... \n\n \t\t" )
let str = attrStr.string

var headCount = 0
var tailCount = 0

var startIx = str.startIndex
var endIx = str.endIndex

while endIx > startIx && str[ endIx.predecessor() ].isMemberOfSet( whitespace ) {
  ++tailCount
  endIx = endIx.predecessor()
}
if endIx > startIx {
  while str[ startIx ].isMemberOfSet( whitespace ) {
    ++headCount
    startIx = startIx.successor()
  }
  let length = attrStr.length - ( headCount + tailCount )
  result = NSRange( location:headCount, length:length )
} else {
  // String was empty or all whitespace
  result = NSRange( location:0, length:0 )
}

let resultString = attrStr.attributedSubstringFromRange( result )


— 

Charles

Re: Swift: How to determine if a Character represents whitespace?

2015-04-03 Thread Marco S Hyman

 extension Character {
 
   func isMemberOfSet( set:NSCharacterSet ) -> Bool
   {
     // The for loop only executes once;
     // its purpose is to convert Character to a type
     // you can actually do something with
     for char in String( self ).utf16 {
       if set.characterIsMember( char ) {
         return true
       }
     }
     return false
   }
 
 }

I believe your comment that the loop executes once is incorrect.   It may
execute more than once when the Character is a composed character that
maps to multiple utf16 characters.

Example (stolen from this link):
  
http://stackoverflow.com/questions/27697508/nscharacterset-characterismember-with-swifts-character-type


let acuteA: Character = "\u{e1}"                 // An a with an accent
let acuteAComposed: Character = "\u{61}\u{301}"  // Also an a with an accent

Both are a single Character.  The latter will cause the loop to iterate twice.
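
The same point can be checked directly in current Swift. This is a minimal sketch assuming Swift 5; the helper name `utf16Length` is illustrative and does not come from the thread.

```swift
// Hypothetical helper: number of UTF-16 code units behind one Character.
func utf16Length(_ c: Character) -> Int {
    return String(c).utf16.count
}

let precomposed: Character = "\u{e1}"        // U+00E1, a single code point
let decomposed: Character = "\u{61}\u{301}"  // "a" followed by a combining acute accent

print(utf16Length(precomposed))   // 1
print(utf16Length(decomposed))    // 2
print(precomposed == decomposed)  // true: Characters compare by canonical equivalence
```

So a loop over `String(c).utf16` runs once for the first Character and twice for the second, even though the two compare equal.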

Marc

Re: Swift: How to determine if a Character represents whitespace?

2015-04-03 Thread Quincey Morris
On Apr 3, 2015, at 04:00 , Charles Jenkins cejw...@gmail.com wrote:
 
 for char in String( self ).utf16 {
   if set.characterIsMember( char ) {
     return true
   }

Now we’re back to the place we started. This code is wrong. It fails for any 
code point that isn’t representable as a single UTF-16 code value, and it fails 
for any grapheme that isn’t representable as a single code point.

This is what I would do (playground version):

 import Cocoa
 
 let notWhitespace = 
 NSCharacterSet.whitespaceAndNewlineCharacterSet().invertedSet
 let attrStr = NSAttributedString( string:"    Fourscore and seven years ago... \n\n \t\t" )
 let str = attrStr.string as NSString
 
 let startRange = str.rangeOfCharacterFromSet(notWhitespace, options: 
 NSStringCompareOptions.allZeros)
 let endRange = str.rangeOfCharacterFromSet(notWhitespace, options: 
 NSStringCompareOptions.BackwardsSearch)
 
 let startIndex = startRange.length != 0 ? startRange.location : 0
 let endIndex = endRange.length != 0 ? endRange.location + 1 : str.length
 
 let resultRange = NSRange (location: startIndex, length: endIndex - 
 startIndex)
 let resultStr = attrStr.attributedSubstringFromRange (resultRange)

It’s the Obj-C code, just written in Swift. The ‘as NSString’ in the 3rd line 
makes it work.

The practical difficulty in your original approach is that (e.g.) 
String.rangeOfCharacterFromSet returns a Range<String.Index>, but AFAICT that 
isn’t convertible back to an NSRange, or even just integer indexes. At the same 
time, AFAICT it isn’t useful with a String because it doesn’t contain Character 
indexes, just unichar indexes, which have no meaning for a String in general.

Actually, my testing is with Swift 1.1, since I’m not in a position to move to 
Xcode 6.3 yet. It’s possible that the results are different in Swift 1.2. 
However, in the section of the release notes that talks about bridging between 
String and NSString, it says:

 Note that these Cocoa types in Objective-C headers are still automatically 
 bridged to their corresponding Swift type

so I suspect the results would be the same in 1.2. It seems to me there is an 
actual bug here:

“String methods corresponding to NSString methods that return NSRange values 
actually return Range<String.Index> values, but these are not valid ranges, 
either for String objects (they represent UTF-16 code value positions, not 
Character positions) or for NSString objects (they’re not convertible back to 
NSRange). The String methods ought to return NSRange values just like their 
NSString counterparts.”
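
For readers on newer toolchains: Swift 4 and later provide the initializers `NSRange(_:in:)` and `Range(_:in:)` in Foundation, which convert between the two range types. The sketch below assumes Swift 5; the helper name `utf16RangeOfFirstWhitespace` is hypothetical and exists only for illustration.

```swift
import Foundation

// Hypothetical helper: finds the first whitespace character in a Swift String
// and reports its position as an NSRange, i.e. in UTF-16 code units.
func utf16RangeOfFirstWhitespace(in text: String) -> NSRange? {
    // On String, rangeOfCharacter(from:) returns Range<String.Index>?
    guard let swiftRange = text.rangeOfCharacter(from: .whitespaces) else {
        return nil
    }
    // NSRange(_:in:) converts the String.Index range to UTF-16 offsets.
    return NSRange(swiftRange, in: text)
}

let sample = "Hello\u{1F600}, String"  // the emoji occupies two UTF-16 code units
if let nsRange = utf16RangeOfFirstWhitespace(in: sample) {
    print(nsRange.location, nsRange.length)  // 8 1
    // Range(_:in:) converts back; it returns nil if the NSRange does not
    // fall on valid String.Index boundaries.
    let swiftRange = Range(nsRange, in: sample)
    print(swiftRange != nil)                 // true
}
```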




Re: Swift: How to determine if a Character represents whitespace?

2015-04-03 Thread Marco S Hyman

 On Apr 3, 2015, at 11:04 AM, Quincey Morris 
 quinceymor...@rivergatesoftware.com wrote:
 
 On Apr 3, 2015, at 04:00 , Charles Jenkins cejw...@gmail.com wrote:
 
  for char in String( self ).utf16 {
    if set.characterIsMember( char ) {
      return true
    }
 
 Now we’re back to the place we started. This code is wrong. It fails for any 
 code point that isn’t representable as a single UTF-16 code value, and it fails 
 for any grapheme that isn’t representable as a single code point.

No it doesn't.   Give it a test.

let acuteA: Character = "\u{e1}"                 // An a with an accent
let acuteAComposed: Character = "\u{61}\u{301}"  // Also an a with an accent

func howManyChars( c: Character) -> Int {
    var count = 0
    for char in String( c ).utf16 {
        count += 1
    }
    return count
}

howManyChars(acuteA)          // returns 1
howManyChars(acuteAComposed)  // returns 2

The original code will return true only if all code points map to white space.

Marc


Re: Swift: How to determine if a Character represents whitespace?

2015-04-03 Thread Quincey Morris
On Apr 3, 2015, at 11:19 , Marco S Hyman m...@snafu.org wrote:
 
 The original code will return true only if all code points map to white space.

The “failure” I was talking about is something a bit different. It has two 
problems:

1. For Unicode code points that are represented by 2 code values, it tests the 
code values, not the code points. That’s wrong.

2. For graphemes that are represented by 2 or more code points, it still tests 
the code values, of which there could be 4 or more per grapheme. That’s also 
wrong. With the ‘for char in String (self)’ code, if you tested whether a 
decomposed acuteA was in the (7-bit) ASCII character set, you’d get the answer 
“YES”.

You could mitigate #1 by using UTF-32 code values instead of UTF-16, but that 
wouldn’t help with #2.
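
The ASCII example can be reproduced with a short sketch in current Swift (assumed 5.x with Foundation). Both helper names are hypothetical, and the code-unit test is an approximation of `characterIsMember` built on `CharacterSet.contains`, not the original method.

```swift
import Foundation

// 7-bit ASCII expressed as a set of code points.
let ascii = CharacterSet(charactersIn: Unicode.Scalar(UInt8(0x00))...Unicode.Scalar(UInt8(0x7F)))

// The approach under discussion: true if ANY UTF-16 code unit is in the set.
func anyCodeUnitIsMember(_ c: Character, of set: CharacterSet) -> Bool {
    return String(c).utf16.contains { unit in
        // A lone surrogate is not a code point, so it never matches here.
        guard let scalar = Unicode.Scalar(unit) else { return false }
        return set.contains(scalar)
    }
}

// A stricter variant: true only if EVERY code point is in the set.
func everyScalarIsMember(_ c: Character, of set: CharacterSet) -> Bool {
    return c.unicodeScalars.allSatisfy { set.contains($0) }
}

let decomposedAcuteA: Character = "a\u{301}"
print(anyCodeUnitIsMember(decomposedAcuteA, of: ascii))  // true
print(everyScalarIsMember(decomposedAcuteA, of: ascii))  // false
```

Requiring every code point to match avoids the false positive for the combining accent, but it still treats the grapheme as a bag of code points, which is problem #2.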




Re: Swift: How to determine if a Character represents whitespace?

2015-04-03 Thread Charles Jenkins
So my Character.isMemberOfSet() is a poor general-purpose method, and I need to 
ditch it.

I like your code. I had to modify it a bit so it wouldn’t fail on strings 
composed entirely of whitespace:

let testString = "     \n\n \t\t"
let attrStr = NSAttributedString( string:testString )
let str = attrStr.string as NSString

let notWhitespace = 
NSCharacterSet.whitespaceAndNewlineCharacterSet().invertedSet

var resultRange:NSRange

let startRange = str.rangeOfCharacterFromSet( notWhitespace, 
options:NSStringCompareOptions.allZeros )
if startRange.length > 0 {
  let endRange = str.rangeOfCharacterFromSet( notWhitespace, 
options:NSStringCompareOptions.BackwardsSearch )
  let startIndex = startRange.location
  let endIndex = endRange.location + endRange.length
  resultRange = NSRange( location:startIndex, length:endIndex - startIndex )
} else {
  // String is empty or all whitespace
  resultRange = NSRange( location:0, length:0 )
}
let resultStr = attrStr.attributedSubstringFromRange( resultRange )
 
So, even though attrStr.string returns an NSString, you have to use the “as” to 
explicitly keep the type and be able to do math on range indexes. Lacking that 
cast is what made rangeOfCharacterFromSet() useless to me yesterday.
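
For comparison, here is a minimal sketch of the same trimming logic in current Swift (assumed 5.x with Foundation), where the cast is no longer needed. `trimmedRange(of:)` is a hypothetical helper name, and `NSRange(_:in:)` postdates this thread.

```swift
import Foundation

// Hypothetical helper: the NSRange (UTF-16 offsets) of `text` with leading and
// trailing whitespace and newlines excluded; {0, 0} when nothing else remains.
func trimmedRange(of text: String) -> NSRange {
    let notWhitespace = CharacterSet.whitespacesAndNewlines.inverted
    guard let first = text.rangeOfCharacter(from: notWhitespace),
          let last = text.rangeOfCharacter(from: notWhitespace, options: .backwards)
    else {
        // String is empty or all whitespace
        return NSRange(location: 0, length: 0)
    }
    return NSRange(first.lowerBound..<last.upperBound, in: text)
}

let attrStr = NSAttributedString(string: "    Fourscore and seven years ago... \n\n \t\t")
let trimmed = attrStr.attributedSubstring(from: trimmedRange(of: attrStr.string))
print("[\(trimmed.string)]")  // [Fourscore and seven years ago...]
```

Taking the upper bound of the backwards match plays the same role as `endRange.location + endRange.length` above, so a final character that needs two UTF-16 code units is kept whole.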

Your code seems way better, but is there a way in the playground for us to 
test addresses to make sure attrStr.string as NSString doesn’t perform a copy? 

— 

Charles


Re: Swift: How to determine if a Character represents whitespace?

2015-04-03 Thread Quincey Morris
On Apr 3, 2015, at 13:18 , Charles Jenkins cejw...@gmail.com wrote:
 
 is there a way in the playground for us to test addresses to make sure 
 attrStr.string as NSString doesn’t perform a copy? 

I doubt it. This is the best I could come up with in a couple of minutes:

 import Cocoa
 
 let notWhitespace = 
 NSCharacterSet.whitespaceAndNewlineCharacterSet().invertedSet
 let count = 50
 
 let aString: String = String (count: count, repeatedValue: Character ("A"))
 let aNSString: NSString = ("" as NSString).stringByPaddingToLength (count, 
 withString: "A", startingAtIndex: 0)
 
 let date1 = NSDate ()
 let bString: String = aNSString as String
 let date2 = NSDate ()
 let time2 = date2.timeIntervalSinceDate(date1)
 
 let date3 = NSDate ()
 let bNSString: NSString = aString as NSString
 let date4 = NSDate ()
 let time4 = date4.timeIntervalSinceDate(date3)
 
 let attrStr = NSAttributedString (string: aNSString)
 
 let date5 = NSDate ()
 let range5 = attrStr.string.rangeOfCharacterFromSet(notWhitespace, options: 
 NSStringCompareOptions.allZeros)
 let date6 = NSDate ()
 let time6 = date6.timeIntervalSinceDate(date5)
 
 let date7 = NSDate ()
 let range7 = (attrStr.string as 
 NSString).rangeOfCharacterFromSet(notWhitespace, options: 
 NSStringCompareOptions.allZeros)
 let date8 = NSDate ()
 let time8 = date8.timeIntervalSinceDate(date7)

Playground results:

time2: 0.3328
time4: 0.1817
time6: 0.0022
time8: 0.0017

Since the “rangeOfCharacter” scans terminate at the first character, this 
suggests that there’s no real conversion in the last case, which is the one 
you’re interested in. (Also, time6 and time8 don’t vary with the value of 
‘count’.) However, generalizing from this seems treacherous. And I may have 
just Done It Wrong™.




Re: Swift: How to determine if a Character represents whitespace?

2015-04-03 Thread Roland King
ok here’s my try, assuming NSLinguisticTagger knows what it’s doing. And yes 
it’s a bit stupid to use a linguistic tagger to do something like that but .. 
whatever 

var str = "Some String WIth Whitespace "

var lt = NSLinguisticTagger( tagSchemes: [NSLinguisticTagSchemeTokenType], 
options: 0 )
lt.string = str

var endsWithWhitespace = ( lt.tagAtIndex( (str as NSString).length-1, scheme: 
NSLinguisticTagSchemeTokenType, tokenRange: nil, sentenceRange: nil ) == 
NSLinguisticTagOtherWhitespace )



Re: Swift: How to determine if a Character represents whitespace?

2015-04-02 Thread Quincey Morris
On Apr 2, 2015, at 19:28 , Charles Jenkins cejw...@gmail.com wrote:
 
 I can indeed call attrStr.string.rangeOfCharacterFromSet(). But in typical 
 Swift string fashion, the return type is as unfriendly as possible: 
 Range<String.Index>? -- as if the NSString were a Swift string.

I finally read the whole of what you said here, and I had to run to a 
playground to check:

 import Cocoa
 
 var strA = "Hello?, String"
 var strB = "Hello?, String" as NSString
 var strC = "Hello\u{1f650}, String"
 var strD = "Hello\u{1f650}, NSString" as NSString
 var rangeA = 
 strA.rangeOfCharacterFromSet(NSCharacterSet.whitespaceCharacterSet()) // 
 {Some "7..<8"}
 var rangeB = 
 strB.rangeOfCharacterFromSet(NSCharacterSet.whitespaceCharacterSet()) // (7,1)
 var rangeC = 
 strC.rangeOfCharacterFromSet(NSCharacterSet.whitespaceCharacterSet()) // 
 {Some "8..<9"}
 var rangeD = 
 strD.rangeOfCharacterFromSet(NSCharacterSet.whitespaceCharacterSet()) // (8,1)

So, yes, these are NSString indexes all the way, even if the result is packaged 
as a Range<String.Index>.




Re: Swift: How to determine if a Character represents whitespace?

2015-04-02 Thread Quincey Morris
On Apr 2, 2015, at 19:28 , Charles Jenkins cejw...@gmail.com wrote:
 
 So after doing two anchored searches, one at the beginning and one at the end 
 of the string, if I get two different ranges, I’m stuck with two values that 
 aren’t subtractable to determine the length of the NSRange I need in a call 
 to attributedSubstringFromRange(). 

Not at all. All of this API is *NSString* API, even if the instance happens to 
be String instead of NSString, so the ranges are NSString-compatible ranges 
(i.e. UTF16 code value ranges), so you can just do the subtraction and use the 
result in attributedSubstringFromRange.

 I think the safest thing for me to do for attributed string compatibility is 
 give up on Swift purity and put my range-trimming function in an Objective-C 
 file.


Again, it’s all NSString API, so the results are what the Obj-C API would 
return. Otherwise, interoperability wouldn’t work.

If, additionally, you cast any String-returning result ‘as’ NSString, then you 
literally are doing Obj-C, though it happens to be created by the Swift 
compiler. That is to say, it’s going to make Obj-C-style method calls with an 
Obj-C-NSString-style object as receiver, so the source language is irrelevant. 
(!)

That, or I’ve run wildly off the rails.




Re: Swift: How to determine if a Character represents whitespace?

2015-04-02 Thread Charles Jenkins
Amen, brother.

Given my attributed string “attrStr,” I can indeed call 
attrStr.string.rangeOfCharacterFromSet(). But in typical Swift string fashion, 
the return type is as unfriendly as possible: Range<String.Index>? -- as if the 
NSString were a Swift string.

So after doing two anchored searches, one at the beginning and one at the end 
of the string, if I get two different ranges, I’m stuck with two values that 
aren’t subtractable to determine the length of the NSRange I need in a call to 
attributedSubstringFromRange(). I think the safest thing for me to do for 
attributed string compatibility is give up on Swift purity and put my 
range-trimming function in an Objective-C file.

— 

Charles

On April 2, 2015 at 2:15:07 PM, Quincey Morris 
(quinceymor...@rivergatesoftware.com) wrote:

On Apr 2, 2015, at 04:54 , Charles Jenkins cejw...@gmail.com wrote:

Swift has a built-in func stringByTrimmingCharactersInSet(set: NSCharacterSet) 
-> String  

There is something wacky going on here — or not. (I know you don’t want to use 
this particular method, but I’m just using it as an example.)

First of all, String and NSString are different classes, for real. Quoting a 
god-like personage, in a recent thread:

On Mar 23, 2015, at 13:52 , Greg Parker gpar...@apple.com wrote:

Most of NSString's methods are re-implemented in a Swift extension on class 
String. You get this extension when you `import Cocoa`.

And indeed, if you try this in a playground:

let strA = "Hello, string"
let strB = "Hello, NSString" as NSString
let a = strA.characterAtIndex (6) // line 3
let b = strB.characterAtIndex (6) // line 4

you get an error at line 3, as you would expect/hope (since Strings aren’t 
“made of” unichars), but no error in line 4 (since NSStrings are).
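
The same distinction still holds in current Swift. In this minimal sketch (assumed Swift 5 with Foundation), `NSString` is indexed by integer UTF-16 offset through `character(at:)`, while a `String` has to go through its `utf16` view with an explicitly computed index.

```swift
import Foundation

let strA = "Hello, string"
let strB = NSString(string: "Hello, NSString")

// NSString is indexed by UTF-16 code unit, so an integer offset is valid.
let b = strB.character(at: 6)  // 32, the space

// String has no integer subscript; the closest equivalent walks the
// utf16 view to build an index first.
let index = strA.utf16.index(strA.utf16.startIndex, offsetBy: 6)
let a = strA.utf16[index]      // 32, the space

print(a, b)  // 32 32
```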

So it’s not odd that String.stringByTrimmingCharactersInSet would return a 
String. What’s very odd is that *in Swift* 
NSString.stringByTrimmingCharactersInSet returns a String — not a NSString — as 
does NSAttributedString.string, or apparently any Cocoa API that would return a 
NSString in Obj-C.

This means it’s not possible *in Swift* to apply NSString methods to a NSString 
and stay entirely within the NSString world without casting/converting. 
*That’s* wacky, given that String and NSString are different classes with 
different (though very similar) APIs.

The only way to un-wack this, that I can think of right now, would be for 
expressions like ‘someNSString.stringByTrimmingCharactersInSet (…) as NSString’ 
to involve only a cheap or free conversion from String to NSString. However 
there is no API contract to this effect AFAIK.

Therefore:

1. We need a god-like personage to step in and un-wack this for real.

2. Subject to the outcome of #1, you can approach this entirely in the NSString 
world, in which case I like Uli’s suggestion, applied to 
'yourAttributedString.string as NSString’. You’d have to verify by performance 
testing that massive conversions aren’t being made.


Re: Swift: How to determine if a Character represents whitespace?

2015-04-02 Thread Charles Jenkins
I kept my original question as brief as I could, but let me tell you what 
problem I’m trying to solve, and maybe someone will have good advice I haven’t 
yet considered.

I’m trying to code in pure Swift. I have an NSAttributedString which can 
potentially be very large, and I want to save off the 
attributedSubstringFromRange: which represents the string with leading and 
trailing whitespace trimmed. I’m trying to avoid copying the giant string 
merely to determine the proper substring range for copying it again.

Swift has a built-in func stringByTrimmingCharactersInSet(set: NSCharacterSet) 
-> String which won’t help me because using it would copy the string and 
discard the attributes. Even using it for length-testing wouldn’t work, because 
I have no way to know how many characters were trimmed off the head versus the 
tail of the string.

What would be nice is a way to count leading and trailing characters in place 
while the thing is still an NSAttributedString--without using 
NSAttributedString.string to convert to a Swift string in the first place. If 
there were no conversion to the unicode-compliant and amazingly 
difficult-to-do-anything-with-it Swift string, I’d be more confident that the 
shrunken range I calculate would be apples to apples.

-- 

Charles

On April 2, 2015 at 01:25:40, Quincey Morris 
(quinceymor...@rivergatesoftware.com) wrote:

On Apr 1, 2015, at 21:17 , Charles Jenkins cejw...@gmail.com wrote:

for ch in String(char).utf16 {  
if !set.characterIsMember(ch) { found = false }  
}  

Except that this code can’t possibly be right, in general. 

1. A ‘unichar’ is a UTF-16 code value, but it’s not a Unicode code point. Some 
UTF-16 code values have no meaning as “characters” by themselves. I think you 
could mitigate this problem by using ‘longCharacterIsMember’, which takes a 
UTF-32 code value instead (and enumerating the string as UTF-32 instead of 
UTF-16).

2. A Swift ‘Character’ isn’t a Unicode code point, but rather a grapheme. That 
is, it might be a sequence of code points (and I mean code points, not code 
values). It might be such a sequence either because there’s no way of 
representing the grapheme by a single code point, or because it’s a composed 
character made up of a base code point and some combining characters.

In this case, you can’t validly test the individual code points for membership 
of the character set.

I’m not sure, but I suspect the underlying obstacle is that NSCharacterSet is 
at best a set of code points, and you cannot test a grapheme for membership of 
a set of code points.

In your particular application, if it’s true that all** Unicode whitespace 
characters are represented as a single code point (via a single UTF-32 code 
value), or a single UTF-16 code value, then you can get away with one of the 
above solutions. Otherwise you’re going to need a more complex solution, that 
doesn’t involve NSCharacterSet at all.



** Or at least the ones you happen to care about, but ignoring the others may 
be a perilous proceeding.
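
One conservative reading of the points above, sketched in current Swift (assumed 5.x with Foundation): treat a Character as whitespace only when every one of its code points is in the set. The helper name `isWhitespace(_:)` is hypothetical, and the all-code-points rule is an assumption chosen for this example, not something the thread settles on.

```swift
import Foundation

// Hypothetical helper: a Character counts as whitespace only if every one of
// its code points is in CharacterSet.whitespacesAndNewlines.
func isWhitespace(_ c: Character) -> Bool {
    return c.unicodeScalars.allSatisfy {
        CharacterSet.whitespacesAndNewlines.contains($0)
    }
}

let samples: [Character] = [
    " ",         // U+0020 SPACE
    "\u{3000}",  // IDEOGRAPHIC SPACE
    "\r\n",      // CR LF: one Character made of two code points
    "a",
    " \u{301}"   // a space carrying a combining acute accent
]
for c in samples {
    print(isWhitespace(c))
}
// Prints true, true, true, false, false
```

Swift 5 also has a standard-library `Character.isWhitespace` property, which may be enough when NSCharacterSet semantics are not required.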


Re: Swift: How to determine if a Character represents whitespace?

2015-04-02 Thread Uli Kusterer
On 02 Apr 2015, at 13:54, Charles Jenkins cejw...@gmail.com wrote:
 What would be nice is a way to count leading and trailing characters in place 
 while the thing is still an NSAttributedString--without using 
 NSAttributedString.string to convert to a Swift string in the first place. If 
 there were no conversion to the unicode-compliant and amazingly 
 difficult-to-do-anything-with-it Swift string, I’d be more confident that the 
 shrunken range I calculate would be apples to apples.


Does Swift have an equivalent to rangeOfCharacterFromSet:options: or would that 
require converting it to NSString?

Because you could just generate the inverse NSCharacterSet to the whitespace 
character set, and then look for the first (NSAnchoredSearch) and last 
(NSAnchoredSearch | NSBackwardsSearch) non-whitespace character, and then 
extract only the range between those two offsets.

Wildly guessing,
-- Uli Kusterer
http://stacksmith.org

Re: Swift: How to determine if a Character represents whitespace?

2015-04-02 Thread Ken Thomases
On Apr 2, 2015, at 6:54 AM, Charles Jenkins cejw...@gmail.com wrote:

 What would be nice is a way to count leading and trailing characters in place 
 while the thing is still an NSAttributedString--without using 
 NSAttributedString.string to convert to a Swift string in the first place.

NSAttributedString.string does not involve a conversion.  The underlying string 
is part of NSAttributedString's data model.  The documentation for the method 
explicitly says, "For performance reasons, this property returns the current 
backing store of the attributed string object."

I don't know if there's a conversion to create a Swift string from that, but 
you don't have to.  I believe you can work with NSString in Swift.

Regards,
Ken



Re: Swift: How to determine if a Character represents whitespace?

2015-04-02 Thread Jens Alfke

 On Apr 2, 2015, at 4:54 AM, Charles Jenkins cejw...@gmail.com wrote:
 
 What would be nice is a way to count leading and trailing characters in place 
 while the thing is still an NSAttributedString--without using 
 NSAttributedString.string to convert to a Swift string in the first place. If 
 there were no conversion to the unicode-compliant and amazingly 
 difficult-to-do-anything-with-it Swift string, I’d be more confident that the 
 shrunken range I calculate would be apples to apples.

Use NSString.rangeOfCharacterFromSet() on the attributed string’s underlying 
NSString.

Don’t use any native Swift String character accessors, because the character 
positions aren’t going to agree with NSString since they use different 
interpretations of Unicode.
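
To make the mismatch concrete, here is a small sketch in current Swift (not the 2015 syntax used elsewhere in this thread). The sample string is invented for illustration. It counts the same text four ways.

```swift
import Foundation

// "é" built from e + combining acute accent, then an emoji, then a space.
let text = "e\u{301}\u{1F642} "
let bridged = NSString(string: text)

print(text.count)                 // Swift Characters (grapheme clusters)
print(text.unicodeScalars.count)  // Unicode code points
print(text.utf16.count)           // UTF-16 code units
print(bridged.length)             // NSString length, also UTF-16 code units
```

For this string the four counts are 3, 4, 5 and 5: only the UTF-16 view lines up with NSString, so a Character offset cannot be reused as an NSRange location.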

—Jens

Re: Swift: How to determine if a Character represents whitespace?

2015-04-02 Thread Charles Jenkins
The documentation certainly says that, Ken, but stick this code in a playground 
and see that you can’t examine the characters via index no matter whether you 
assume it to be String or NSString:

let whitespaceSet = NSCharacterSet.whitespaceAndNewlineCharacterSet()
let attrStr = NSAttributedString( string: "   Fourscore and seven years ago   \n\n\n\t\t\t   " )
let str = attrStr.string

var head = 0
let tooFar = attrStr.length
while head < tooFar {
  if whitespaceSet.characterIsMember( str.characterAtIndex( head ) ) {
    // Skip -- I did it this way so the error message received from the above line will be clear
  } else {
    break;
  }
  ++head
}

var headIx = str.startIndex
let tooFarIx = str.endIndex
while headIx < tooFarIx {
  if whitespaceSet.characterIsMember( str[ headIx ] ) {
    // Skip
  } else {
    break;
  }
  headIx = headIx.successor()
}

characterAtIndex() doesn’t work because it’s not available in Swift. If you 
replace str.characterAtIndex( head ) with str[ head ], you get the same 
error as in the version below it that incorrectly assumes it’s a Swift string: 
“Could not find overload of 'subscript' that accepts the supplied arguments.”

Now, I did just type this out on a computer running Xcode 6.2. At home I’m 
using 6.3 beta, so it’s possible I’ll get home and find one of these versions 
works as expected, even though I’m sure I tried both ways last night when I 
first hit the roadblock…

I’m now guessing that maybe converting from NSString to String and examining 
characters via one of the UTF views might possibly not involve a copy. But then 
how do I decide which view I should be using...
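
One possible answer, sketched in current Swift rather than the syntax of this thread: walk the unicodeScalars view (which is what a character set can test), but keep the running total in UTF-16 units, since those are the units NSString and NSRange use. The helper name `leadingWhitespaceLength` is hypothetical, and whether any copy happens along the way is not something this sketch measures.

```swift
import Foundation

// Hypothetical helper: length of the leading whitespace run, measured in
// UTF-16 code units so it can serve as an NSString/NSRange offset.
func leadingWhitespaceLength(in text: String) -> Int {
    let whitespace = CharacterSet.whitespacesAndNewlines
    var utf16Units = 0
    for scalar in text.unicodeScalars {
        guard whitespace.contains(scalar) else { break }
        // Scalars above U+FFFF take two UTF-16 code units (a surrogate pair).
        utf16Units += scalar.value > 0xFFFF ? 2 : 1
    }
    return utf16Units
}

let raw = " \t\n Fourscore"
let skip = leadingWhitespaceLength(in: raw)
let rest = NSString(string: raw).substring(from: skip)
print(rest)
```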

-- 

Charles


Re: Swift: How to determine if a Character represents whitespace?

2015-04-02 Thread Charles Jenkins
Oops. My documentation viewer was set up wrong. characterAtIndex() is indeed 
supposed to be available in Swift. Don’t know what I’ve done wrong that I can’t 
use it in a playground.

-- 

Charles


Re: Swift: How to determine if a Character represents whitespace?

2015-04-02 Thread Quincey Morris
On Apr 2, 2015, at 04:54 , Charles Jenkins cejw...@gmail.com wrote:

 Swift has a built-in func stringByTrimmingCharactersInSet(set: 
 NSCharacterSet) -> String  

There is something wacky going on here — or not. (I know you don’t want to use 
this particular method, but I’m just using it as an example.)

First of all, String and NSString are different classes, for real. Quoting a 
god-like personage, in a recent thread:

 On Mar 23, 2015, at 13:52 , Greg Parker gpar...@apple.com wrote:
 
 Most of NSString's methods are re-implemented in a Swift extension on class 
 String. You get this extension when you `import Cocoa`.

And indeed, if you try this in a playground:

 let strA = "Hello, string"
 let strB = "Hello, NSString" as NSString
 let a = strA.characterAtIndex (6) // line 3
 let b = strB.characterAtIndex (6) // line 4


you get an error at line 3, as you would expect/hope (since Strings aren’t 
“made of” unichars), but no error in line 4 (since NSStrings are).

So it’s not odd that String.stringByTrimmingCharactersInSet would return a 
String. What’s very odd is that *in Swift* 
NSString.stringByTrimmingCharactersInSet returns a String — not a NSString — as 
does NSAttributedString.string, or apparently any Cocoa API that would return a 
NSString in Obj-C.

This means it’s not possible *in Swift* to apply NSString methods to a NSString 
and stay entirely within the NSString world without casting/converting. 
*That’s* wacky, given that String and NSString are different classes with 
different (though very similar) APIs.

The only way to un-wack this, that I can think of right now, would be for 
expressions like ‘someNSString.stringByTrimmingCharactersInSet (…) as NSString’ 
to involve only a cheap or free conversion from String to NSString. However 
there is no API contract to this effect AFAIK.

Therefore:

1. We need a god-like personage to step in and un-wack this for real.

2. Subject to the outcome of #1, you can approach this entirely in the NSString 
world, in which case I like Uli’s suggestion, applied to 
‘yourAttributedString.string as NSString’. You’d have to verify by performance 
testing that massive conversions aren’t being made.
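
A minimal sketch of option 2, in current Swift spelling and with a made-up helper name (`trimmed`). It uses `NSString(string:)` so that it is self-contained; on Apple platforms the cast suggested above would be the usual spelling. It says nothing about the cost of the conversion, which would still need measuring.

```swift
import Foundation

// Hypothetical helper: returns the attributed substring with leading and
// trailing whitespace removed, keeping the attributes of what remains.
func trimmed(_ attributed: NSAttributedString) -> NSAttributedString {
    // All range arithmetic is done on the NSString side.
    let text = NSString(string: attributed.string)
    let nonWhitespace = CharacterSet.whitespacesAndNewlines.inverted

    let first = text.rangeOfCharacter(from: nonWhitespace)
    guard first.location != NSNotFound else {
        // Empty or all-whitespace input.
        return NSAttributedString(string: "")
    }
    let last = text.rangeOfCharacter(from: nonWhitespace, options: .backwards)

    let keep = NSRange(location: first.location,
                       length: NSMaxRange(last) - first.location)
    return attributed.attributedSubstring(from: keep)
}

let padded = NSAttributedString(string: "  Fourscore \n")
print(trimmed(padded).string)
```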




Re: Swift: How to determine if a Character represents whitespace?

2015-04-01 Thread Quincey Morris
On Apr 1, 2015, at 21:17 , Charles Jenkins cejw...@gmail.com wrote:
 
 for ch in String(char).utf16 {  
 if !set.characterIsMember(ch) { found = false }  
 }  

Except that this code can’t possibly be right, in general. 

1. A ‘unichar’ is a UTF-16 code value, but it’s not a Unicode code point. Some 
UTF-16 code values have no meaning as “characters” by themselves. I think you 
could mitigate this problem by using ‘longCharacterIsMember’, which takes a 
UTF-32 code value instead (and enumerating the string as UTF-32 instead of 
UTF-16).

2. A Swift ‘Character’ isn’t a Unicode code point, but rather a grapheme. That 
is, it might be a sequence of code points (and I mean code points, not code 
values). It might be such a sequence either because there’s no way of 
representing the grapheme by a single code point, or because it’s a composed 
character made up of a base code point and some combining characters.

In this case, you can’t validly test the individual code points for membership 
of the character set.

I’m not sure, but I suspect the underlying obstacle is that NSCharacterSet is 
at best a set of code points, and you cannot test a grapheme for membership of 
a set of code points.

In your particular application, if it’s true that all** Unicode whitespace 
characters are represented as a single code point (via a single UTF-32 code 
value), or a single UTF-16 code value, then you can get away with one of the 
above solutions. Otherwise you’re going to need a more complex solution, that 
doesn’t involve NSCharacterSet at all.



** Or at least the ones you happen to care about, but ignoring the others may 
be a perilous proceeding.
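
A sketch of the code-point approach from point 1, in current Swift rather than the 2015 syntax. It assumes one policy among several possible ones: a Character counts as whitespace only if every one of its code points is in the set. The helper name `isWhitespace` is hypothetical.

```swift
import Foundation

// Hypothetical helper: tests each Unicode scalar (code point) of the grapheme
// against the character set, instead of testing UTF-16 code units.
func isWhitespace(_ character: Character) -> Bool {
    let whitespace = CharacterSet.whitespacesAndNewlines
    return character.unicodeScalars.allSatisfy { whitespace.contains($0) }
}

let plainSpace: Character = " "
let crlf: Character = "\r\n"               // two code points, one grapheme
let accentedSpace: Character = " \u{301}"  // space + combining acute accent
let letter: Character = "a"

print(isWhitespace(plainSpace))
print(isWhitespace(crlf))
print(isWhitespace(accentedSpace))
print(isWhitespace(letter))
```

The space carrying a combining accent is the awkward case described in point 2: under this policy it is rejected, because one of its code points is not in the set.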




Re: Swift: How to determine if a Character represents whitespace?

2015-04-01 Thread Charles Srstka
On Apr 1, 2015, at 8:14 PM, Charles Jenkins cejw...@gmail.com wrote:
 
 Given this code:
 
 let someCharacter = str[str.endIndex.predecessor()]
 
 How can I determine if someCharacter is whitespace?

import Foundation

func isChar(char: Character, inSet set: NSCharacterSet) -> Bool {
    // this function is from an answer on StackOverflow:
    // http://stackoverflow.com/questions/27697508/nscharacterset-characterismember-with-swifts-character-type
    var found = true
    for ch in String(char).utf16 {
        if !set.characterIsMember(ch) { found = false }
    }
    return found
}

let str = "foo "
let chr = str[str.endIndex.predecessor()]

let isWhitespace = isChar(chr, inSet: NSCharacterSet.whitespaceAndNewlineCharacterSet()) // true
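
For readers on Swift 5 or later, which postdates this thread, the standard library has a `Character.isWhitespace` property, so the same check can be sketched without Foundation. My understanding is that it follows Unicode's own whitespace property rather than NSCharacterSet, so the two may not agree on every character. The variable names below are made up for illustration.

```swift
// Swift 5+ only: no Foundation import needed for this check.
let sampleText = "foo "

// `last` is nil for an empty string, so the check is wrapped in `if let`.
if let lastCharacter = sampleText.last {
    print(lastCharacter.isWhitespace)
}
```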

Charles

Re: Swift: How to determine if a Character represents whitespace?

2015-04-01 Thread Charles Jenkins
Thank you very much. :-) I had been trying to figure out how to use 
NSCharacterSet, but I didn’t know the bit about converting to UTF-16 string 
first.    

— 

Charles
