[ 
https://issues.apache.org/jira/browse/UIMA-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693013#comment-13693013
 ] 

Mike Barborak commented on UIMA-3017:
-------------------------------------

Hi,

I was using this non-JCas approach to try to simplify the test case. Your 
observation shows that I simplified it too much. So I have changed the test 
case to be a bit more complex and closer to reality. Here are the results:

Test 1. Get a string from an FS using non-JCas API
round 0 total time 1: 6.980455698s
round 1 total time 1: 6.739509176s
round 2 total time 1: 6.732438268s
round 3 total time 1: 6.738521584s
round 4 total time 1: 6.73231265s

Test 2. Get a string from a POJO via an internal HashMap<String, String>
round 0 total time 2: 0.93895504s
round 1 total time 2: 0.916541992s
round 2 total time 2: 0.918598733s
round 3 total time 2: 0.912975135s
round 4 total time 2: 0.899861037s

Test 3. Get a string from an FS using JCAS API
round 0 total time 3: 1.174062393s
round 1 total time 3: 1.159312484s
round 2 total time 3: 1.158318541s
round 3 total time 3: 1.158692163s
round 4 total time 3: 1.158886561s

Test 4. Get a member string from a POJO
round 0 total time 4: 0.072869196s
round 1 total time 4: 0.067882641s
round 2 total time 4: 0.067807927s
round 3 total time 4: 0.067737007s
round 4 total time 4: 0.067756583s

Test 5. Get a FS reference using the JCas API
round 0 total time 5: 10.611032036s
round 1 total time 5: 10.081770229s
round 2 total time 5: 10.143019053s
round 3 total time 5: 11.222992977s
round 4 total time 5: 10.186875851s

Test 6. Get a FS reference using the JCas API and caching 
jcasType.ll_cas.ll_getFSForRef in a HashMap
round 0 total time 6: 2.257271104s
round 1 total time 6: 2.232764892s
round 2 total time 6: 2.231799763s
round 3 total time 6: 2.234257571s
round 4 total time 6: 2.237339784s

First, I added code to reproduce your new set of times. This is test 3. As you 
can see, I didn't get your times likely because I didn't reproduce your test. 
But I actually think that comparing test 2 and 3 isn't as interesting as some 
other comparisons I make below.

Second, I added a string member to my POJO and then got this member through a 
method in a tight loop. This is test 4 and this is what I would expect to be 
similar to Test 3 (rather than test 2) but it is about 16 times faster.

Third, I added a feature to my type that is a reference to a feature structure. 
I timed getting the referenced feature structure in a tight loop. This is time 
5.

Fourth, I extended the MyType class generated with JCasGen as WrappedMyType. 
This class uses a HashMap<Integer, FeatureStructure> to cache the results of 
ll_getFSForRef(). I timed getting the referenced feature structure with this 
class in a tight loop and this is time 6. I would expect it to be similar to 
Test 5 but it is about 4 times faster.

Thank you for considering these numbers.

Here is my test class:

{code:title=test/MyTest.java}
package test;

import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

import org.apache.uima.UIMAFramework;
import org.apache.uima.cas.CAS;
import org.apache.uima.cas.Feature;
import org.apache.uima.cas.FeatureStructure;
import org.apache.uima.cas.Type;
import org.apache.uima.jcas.JCas;
import org.apache.uima.resource.metadata.TypeSystemDescription;
import org.apache.uima.util.CasCreationUtils;
import org.apache.uima.util.XMLInputSource;

import com.ibm.uima.matching.TestSupport;

public class MyTest {
  
  static class MyClass {
    String memberString;
    Map<String, String> myFeatures = new HashMap<String, String>();
    
    void setStringValue(String feature, String value) {
      myFeatures.put(feature, value);
    }
    
    String getStringValue(String feature) {
      return myFeatures.get(feature);
    }
    
    void setMemberStringValue(String value) {
      memberString = value;
    }
    
    String getMemberStringValue() {
      return memberString;
    }
  }
  
  static class WrappedMyType extends MyType {
    private Map<Integer, FeatureStructure> getFSForRefCache = new 
HashMap<Integer, FeatureStructure>();
    
    public WrappedMyType(JCas jCas) {
      super(jCas);
    }

    protected FeatureStructure getFSForRef(int fsRef) {
      if (fsRef == 0) return null;
      FeatureStructure fs = getFSForRefCache.get(fsRef);
      if (fs == null) {
        fs = jcasType.ll_cas.ll_getFSForRef(fsRef);
        getFSForRefCache.put(fsRef, fs);
      }
      return fs;
    }
    
    @Override
    public MyType getMyTypeRef() {
      if (MyType_Type.featOkTst && ((MyType_Type)jcasType).casFeat_myTypeRef == 
null)
        jcasType.jcas.throwFeatMissing("myTypeRef", "test.MyType");
      
      int refAddr = jcasType.ll_cas.ll_getRefValue(addr, 
((MyType_Type)jcasType).casFeatCode_myTypeRef);
      MyType result = (MyType) getFSForRef(refAddr);
      return result;
    }
  }
  
  static public void main(String[] argv) throws Exception {
    InputStream stream = 
TestSupport.class.getClassLoader().getResourceAsStream("MyTypes.xml");
    TypeSystemDescription typeSystemDescription = 
UIMAFramework.getXMLParser().parseTypeSystemDescription(new 
XMLInputSource(stream, null));
    CAS cas = CasCreationUtils.createCas(typeSystemDescription, null, null);
    Type myType = cas.getTypeSystem().getType("test.MyType");
    FeatureStructure fs = cas.createFS(myType);
    Feature myFeature = myType.getFeatureByBaseName("myFeature");
    fs.setStringValue(myFeature, "myString");
    cas.addFsToIndexes(fs);
    
    MyClass myInstance = new MyClass();
    myInstance.setStringValue("myFeature2", "myString2");
    myInstance.setMemberStringValue("myMemberString2");
    
    MyType myType2 = new MyType(cas.getJCas()); 
    myType2.setMyFeature("myString3");
    myType2.setMyTypeRef(myType2);
    myType2.addToIndexes();
    
    WrappedMyType wrappedMyType2 = new WrappedMyType(cas.getJCas()); 
    wrappedMyType2.setMyFeature("myString4");
    wrappedMyType2.setMyTypeRef(myType2);
    wrappedMyType2.addToIndexes();
    
    long iterations = 100000000;
    long rounds = 5;
    double nanoSecsPerSec = 1000000000.0d;
    
    System.out.println("Test 1. Get a string from an FS using non-JCas API");
    for (int round = 0; round < rounds; round++) {
      long start = System.nanoTime();
      for (long i = 0; i < iterations; i++) {
        fs.getStringValue(myFeature);
      }
      long end = System.nanoTime();
      System.out.println("round " + round + " total time 1: " + ((end - start) 
/ nanoSecsPerSec) + "s");
    }
      
    System.out.println("\nTest 2. Get a string from a POJO via an internal 
HashMap<String, String>");
    for (int round = 0; round < rounds; round++) {
      long start = System.nanoTime();
      for (long i = 0; i < iterations; i++) {
        myInstance.getStringValue("myFeature2");
      }
      long end = System.nanoTime();
      System.out.println("round " + round + " total time 2: " + ((end - start) 
/ nanoSecsPerSec) + "s");
    }
    
    System.out.println("\nTest 3. Get a string from an FS using JCAS API");
    for (int round = 0; round < rounds; round++) {
      long start = System.nanoTime();
      for (long i = 0; i < iterations; i++) {
        myType2.getMyFeature();
      }
      long end = System.nanoTime();
      System.out.println("round " + round + " total time 3: " + ((end - start) 
/ nanoSecsPerSec) + "s");
    }
    
    System.out.println("\nTest 4. Get a member string from a POJO");
    for (int round = 0; round < rounds; round++) {
      long start = System.nanoTime();
      for (long i = 0; i < iterations; i++) {
        myInstance.getMemberStringValue();
      }
      long end = System.nanoTime();
      System.out.println("round " + round + " total time 4: " + ((end - start) 
/ nanoSecsPerSec) + "s");
    }
    
    System.out.println("\nTest 5. Get a FS reference using the JCas API");
    for (int round = 0; round < rounds; round++) {
      long start = System.nanoTime();
      for (long i = 0; i < iterations; i++) {
        myType2.getMyTypeRef();
      }
      long end = System.nanoTime();
      System.out.println("round " + round + " total time 5: " + ((end - start) 
/ nanoSecsPerSec) + "s");
    }
    
    System.out.println("\nTest 6. Get a FS reference using the JCas API and 
caching jcasType.ll_cas.ll_getFSForRef in a HashMap");
    for (int round = 0; round < rounds; round++) {
      long start = System.nanoTime();
      for (long i = 0; i < iterations; i++) {
        wrappedMyType2.getMyTypeRef();
      }
      long end = System.nanoTime();
      System.out.println("round " + round + " total time 6: " + ((end - start) 
/ nanoSecsPerSec) + "s");
    }
  }

}
{code}

Here is my type system:

{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier";>
  <name>MyTypes</name>
  <description/>
  <version>1.0</version>
  <vendor/>
  <types>
    <typeDescription>
      <name>test.MyType</name>
      <description/>
      <supertypeName>uima.cas.TOP</supertypeName>
      <features>
        <featureDescription>
          <name>myFeature</name>
          <description/>
          <rangeTypeName>uima.cas.String</rangeTypeName>
        </featureDescription>
        <featureDescription>
                <name>myTypeRef</name>
                <description/>
                <rangeTypeName>test.MyType</rangeTypeName>
        </featureDescription>
      </features>
    </typeDescription>
  </types>
</typeSystemDescription>
{code}

Here is my TOP extension generated by JCasGen:

{code:title=test/MyType.java}
/* First created by JCasGen Tue Jun 25 08:23:54 EDT 2013 */
package test;

import org.apache.uima.jcas.JCas; 
import org.apache.uima.jcas.JCasRegistry;
import org.apache.uima.jcas.cas.TOP_Type;

import org.apache.uima.jcas.cas.TOP;


/** 
 * Updated by JCasGen Tue Jun 25 08:23:54 EDT 2013
 * XML source: 
/home/mabarbor/interactive_wics_wksp/com.ibm.uima.matching/descriptors/MyTypes.xml
 * @generated */
public class MyType extends TOP {
  /** @generated
   * @ordered 
   */
  @SuppressWarnings ("hiding")
  public final static int typeIndexID = JCasRegistry.register(MyType.class);
  /** @generated
   * @ordered 
   */
  @SuppressWarnings ("hiding")
  public final static int type = typeIndexID;
  /** @generated  */
  @Override
  public              int getTypeIndexID() {return typeIndexID;}
 
  /** Never called.  Disable default constructor
   * @generated */
  protected MyType() {/* intentionally empty block */}
    
  /** Internal - constructor used by generator 
   * @generated */
  public MyType(int addr, TOP_Type type) {
    super(addr, type);
    readObject();
  }
  
  /** @generated */
  public MyType(JCas jcas) {
    super(jcas);
    readObject();   
  } 

  /** <!-- begin-user-doc -->
    * Write your own initialization here
    * <!-- end-user-doc -->
  @generated modifiable */
  private void readObject() {/*default - does nothing empty block */}
     
 
    
  //*--------------*
  //* Feature: myFeature

  /** getter for myFeature - gets 
   * @generated */
  public String getMyFeature() {
    if (MyType_Type.featOkTst && ((MyType_Type)jcasType).casFeat_myFeature == 
null)
      jcasType.jcas.throwFeatMissing("myFeature", "test.MyType");
    return jcasType.ll_cas.ll_getStringValue(addr, 
((MyType_Type)jcasType).casFeatCode_myFeature);}
    
  /** setter for myFeature - sets  
   * @generated */
  public void setMyFeature(String v) {
    if (MyType_Type.featOkTst && ((MyType_Type)jcasType).casFeat_myFeature == 
null)
      jcasType.jcas.throwFeatMissing("myFeature", "test.MyType");
    jcasType.ll_cas.ll_setStringValue(addr, 
((MyType_Type)jcasType).casFeatCode_myFeature, v);}    
   
    
  //*--------------*
  //* Feature: myTypeRef

  /** getter for myTypeRef - gets 
   * @generated */
  public MyType getMyTypeRef() {
    if (MyType_Type.featOkTst && ((MyType_Type)jcasType).casFeat_myTypeRef == 
null)
      jcasType.jcas.throwFeatMissing("myTypeRef", "test.MyType");
    return 
(MyType)(jcasType.ll_cas.ll_getFSForRef(jcasType.ll_cas.ll_getRefValue(addr, 
((MyType_Type)jcasType).casFeatCode_myTypeRef)));}
    
  /** setter for myTypeRef - sets  
   * @generated */
  public void setMyTypeRef(MyType v) {
    if (MyType_Type.featOkTst && ((MyType_Type)jcasType).casFeat_myTypeRef == 
null)
      jcasType.jcas.throwFeatMissing("myTypeRef", "test.MyType");
    jcasType.ll_cas.ll_setRefValue(addr, 
((MyType_Type)jcasType).casFeatCode_myTypeRef, 
jcasType.ll_cas.ll_getFSRef(v));}    
  }
{code}

And here is the TOP_Type extension created by JCasGen:

{code:title=test/MyType_Type.java}

/* First created by JCasGen Tue Jun 25 08:23:54 EDT 2013 */
package test;

import org.apache.uima.jcas.JCas;
import org.apache.uima.jcas.JCasRegistry;
import org.apache.uima.cas.impl.CASImpl;
import org.apache.uima.cas.impl.FSGenerator;
import org.apache.uima.cas.FeatureStructure;
import org.apache.uima.cas.impl.TypeImpl;
import org.apache.uima.cas.Type;
import org.apache.uima.cas.impl.FeatureImpl;
import org.apache.uima.cas.Feature;
import org.apache.uima.jcas.cas.TOP_Type;

/** 
 * Updated by JCasGen Tue Jun 25 08:23:54 EDT 2013
 * @generated */
public class MyType_Type extends TOP_Type {
  /** @generated */
  @Override
  protected FSGenerator getFSGenerator() {return fsGenerator;}
  /** @generated */
  private final FSGenerator fsGenerator = 
    new FSGenerator() {
      public FeatureStructure createFS(int addr, CASImpl cas) {
                         if (MyType_Type.this.useExistingInstance) {
                           // Return eq fs instance if already created
                     FeatureStructure fs = 
MyType_Type.this.jcas.getJfsFromCaddr(addr);
                     if (null == fs) {
                       fs = new MyType(addr, MyType_Type.this);
                           MyType_Type.this.jcas.putJfsFromCaddr(addr, fs);
                           return fs;
                     }
                     return fs;
        } else return new MyType(addr, MyType_Type.this);
          }
    };
  /** @generated */
  @SuppressWarnings ("hiding")
  public final static int typeIndexID = MyType.typeIndexID;
  /** @generated 
     @modifiable */
  @SuppressWarnings ("hiding")
  public final static boolean featOkTst = 
JCasRegistry.getFeatOkTst("test.MyType");
 
  /** @generated */
  final Feature casFeat_myFeature;
  /** @generated */
  final int     casFeatCode_myFeature;
  /** @generated */ 
  public String getMyFeature(int addr) {
        if (featOkTst && casFeat_myFeature == null)
      jcas.throwFeatMissing("myFeature", "test.MyType");
    return ll_cas.ll_getStringValue(addr, casFeatCode_myFeature);
  }
  /** @generated */    
  public void setMyFeature(int addr, String v) {
        if (featOkTst && casFeat_myFeature == null)
      jcas.throwFeatMissing("myFeature", "test.MyType");
    ll_cas.ll_setStringValue(addr, casFeatCode_myFeature, v);}
    
  
 
  /** @generated */
  final Feature casFeat_myTypeRef;
  /** @generated */
  final int     casFeatCode_myTypeRef;
  /** @generated */ 
  public int getMyTypeRef(int addr) {
        if (featOkTst && casFeat_myTypeRef == null)
      jcas.throwFeatMissing("myTypeRef", "test.MyType");
    return ll_cas.ll_getRefValue(addr, casFeatCode_myTypeRef);
  }
  /** @generated */    
  public void setMyTypeRef(int addr, int v) {
        if (featOkTst && casFeat_myTypeRef == null)
      jcas.throwFeatMissing("myTypeRef", "test.MyType");
    ll_cas.ll_setRefValue(addr, casFeatCode_myTypeRef, v);}
    
  



  /** initialize variables to correspond with Cas Type and Features
        * @generated */
  public MyType_Type(JCas jcas, Type casType) {
    super(jcas, casType);
    casImpl.getFSClassRegistry().addGeneratorForType((TypeImpl)this.casType, 
getFSGenerator());

 
    casFeat_myFeature = jcas.getRequiredFeatureDE(casType, "myFeature", 
"uima.cas.String", featOkTst);
    casFeatCode_myFeature  = (null == casFeat_myFeature) ? 
JCas.INVALID_FEATURE_CODE : ((FeatureImpl)casFeat_myFeature).getCode();

 
    casFeat_myTypeRef = jcas.getRequiredFeatureDE(casType, "myTypeRef", 
"test.MyType", featOkTst);
    casFeatCode_myTypeRef  = (null == casFeat_myTypeRef) ? 
JCas.INVALID_FEATURE_CODE : ((FeatureImpl)casFeat_myTypeRef).getCode();

  }
}
{code}





                
> Getting feature value from feature structure longer than expected
> -----------------------------------------------------------------
>
>                 Key: UIMA-3017
>                 URL: https://issues.apache.org/jira/browse/UIMA-3017
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 2.3
>         Environment: Linux x86_64
>            Reporter: Mike Barborak
>            Priority: Minor
>
> Should getting a value of a feature in a feature structure be fast? 
> Intuitively, I would expect performance to be about the same as getting an 
> entry from a Java HashMap or faster but in my experiments it seems to be 8 
> times slower. To solve my problem, I wrap my feature structures with caching 
> Java code but it seems that there might be an opportunity to speed up UIMA 
> generally.
> My test creates a CAS with a single feature structure in it. It sets a string 
> feature in that feature structure and then simply gets the value of that 
> feature in a tight loop. I compare that to an instance of a Java class that 
> has an internal HashMap of strings to strings. In that case, a method is 
> called on that instance to get an entry from the map in a very tight loop. 
> I do 5 rounds of each of the loops. The total times for the rounds involving 
> the CAS were:
> round 0 total time 1: 7.520104509s
> round 1 total time 1: 6.812214938s
> round 2 total time 1: 6.882752307s
> round 3 total time 1: 6.728515004s
> round 4 total time 1: 6.813674956s
> The total times for the rounds just using the Java class were:
> round 0 total time 2: 0.847296054s
> round 1 total time 2: 0.814570347s
> round 2 total time 2: 0.814399859s
> round 3 total time 2: 0.814189383s
> round 4 total time 2: 0.814979357s
> Here is my Java code:
> {code:title=MyTest.java}
> package test;
> import java.io.InputStream;
> import java.util.HashMap;
> import java.util.Map;
> import org.apache.uima.UIMAFramework;
> import org.apache.uima.cas.CAS;
> import org.apache.uima.cas.Feature;
> import org.apache.uima.cas.FeatureStructure;
> import org.apache.uima.cas.Type;
> import org.apache.uima.resource.metadata.TypeSystemDescription;
> import org.apache.uima.util.CasCreationUtils;
> import org.apache.uima.util.XMLInputSource;
> public class MyTest {
>   
>   static class MyClass {
>     Map<String, String> myFeatures = new HashMap<String, String>();
>     
>     void setStringValue(String feature, String value) {
>       myFeatures.put(feature, value);
>     }
>     
>     String getStringValue(String feature) {
>       return myFeatures.get(feature);
>     }
>   }
>   
>   static public void main(String[] argv) throws Exception {
>     InputStream stream = 
> TestSupport.class.getClassLoader().getResourceAsStream("MyTypes.xml");
>     TypeSystemDescription typeSystemDescription = 
> UIMAFramework.getXMLParser().parseTypeSystemDescription(new 
> XMLInputSource(stream, null));
>     CAS cas = CasCreationUtils.createCas(typeSystemDescription, null, null);
>     Type myType = cas.getTypeSystem().getType("MyType");
>     FeatureStructure fs = cas.createFS(myType);
>     Feature myFeature = myType.getFeatureByBaseName("myFeature");
>     fs.setStringValue(myFeature, "myString");
>     cas.addFsToIndexes(fs);
>     
>     MyClass myInstance = new MyClass();
>     myInstance.setStringValue("myFeature2", "myString2");
>     
>     long iterations = 100000000;
>     double nanoSecsPerSec = 1000000000.0d;
>     
>     for (int round = 0; round < 5; round++) {
>       long start = System.nanoTime();
>       for (long i = 0; i < iterations; i++) {
>         fs.getStringValue(myFeature);
>       }
>       long end = System.nanoTime();
>       System.out.println("round " + round + " total time 1: " + ((end - 
> start) / nanoSecsPerSec) + "s");
>     }
>       
>     for (int round = 0; round < 5; round++) {
>       long start = System.nanoTime();
>       for (long i = 0; i < iterations; i++) {
>         myInstance.getStringValue("myFeature2");
>       }
>       long end = System.nanoTime();
>       System.out.println("round " + round + " total time 2: " + ((end - 
> start) / nanoSecsPerSec) + "s");
>     }
>   }
> }
> {code}
> Here is my type descriptor:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier";>
>   <name>MyTypes</name>
>   <description/>
>   <version>1.0</version>
>   <vendor/>
>   <types>
>     <typeDescription>
>       <name>MyType</name>
>       <description/>
>       <supertypeName>uima.cas.TOP</supertypeName>
>       <features>
>         <featureDescription>
>           <name>myFeature</name>
>           <description></description>
>           <rangeTypeName>uima.cas.String</rangeTypeName>
>         </featureDescription>
>       </features>
>     </typeDescription>
>   </types>
> </typeSystemDescription>
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to