Hi Shan,

On Tue, 18 Oct 2005, shan wrote:

> Hi,
>   I am trying the x86 module of Flexus. It works perfectly with single
> thread application, but when I try some multithreaded program (just a very
> simple toy application with pthread_create and pthread_join). There is some
> error. The error information is like:
>
> ....
> 31 <flexus.cpp:240> {1835008}- Timestamp: 2005-Oct-18 09:57:25
> [cpu0] <address not in TLB>
> simics> c
> [cpu0] <address not in TLB>
> simics>

The address not in TLB exception is an exception that Simics raises
anytime some piece of code (i.e., Flexus) asks it to translate a logical
address to a physical address, but the translation is not available in the
TLB of the current CPU.  This situation is very unusual - it probably only
arises while the OS is manipulating TLB entries or is about to take a page
fault on an instruction reference or something similar.  However,
functions like pthread_create are more likely to create this situation.

The fact that this is causing your simulation to stop is a bug in Flexus.
Flexus uses the SIM_logical_to_physical call to get the physical PC
of instructions in a few special cases.  This call can fail if the
translation for the PC is not available in the TLB (which implies the CPU
is about to take an ITLB fault).  However, I forgot to include an error
check after the calls to SIM_logical_to_physical, so the exception ends up
propagating back to the Simics frontend, and stops the simulation.

The fix is to add an error check after the SIM_logical_to_physical calls
in components/InorderSimicsFeeder/SimicsTracer.hpp.  I have attached a
fixed version of the file to this email.  Note that I did not test it
(except to check that it compiles), as I do not have any x86 test images
handy.  We don't use x86 extensively here, which is why this bug has gone
unnoticed.

Please let me know if you continue to have problems. If so, I will help
you add a bunch of debugging messages so we can confirm if the problem is
actually what I think it is.

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University

>
>   It happens after the program simulated a little while, maybe right at the
> time when the thread is created.
>   I guess I miss some multi-thread related flag... what should I do to
> simulate a multithreaded application?
>   Oh, my toy application can run on simics without flexus module loaded.
> Thanks
> Shan
>
>
-------------- next part --------------
// DO-NOT-REMOVE begin-copyright-block
//
// Redistributions of any form whatsoever must retain and/or include the
// following acknowledgment, notices and disclaimer:
//
// This product includes software developed by Carnegie Mellon University.
//
// Copyright \(c\) 2005 by Brian Gold, Nikos Hardavellas, Jangwoo Kim,
// Jared Smolens, Stephen Somogyi, Tom Wenisch, Babak Falsafi and
// James C. Hoe for the SimFlex Project, Computer Architecture Lab
// at Carnegie Mellon, Carnegie Mellon University.
//
// For more information, see the SimFlex project website at:
//   http://www.ece.cmu.edu/~simflex
//
// You may not use the name Carnegie Mellon University or derivations
// thereof to endorse or promote products derived from this software.
//
// If you modify the software you must place a notice on or within any
// modified version provided or made available to any third party stating
// that you have modified the software.  The notice shall include at least
// your name, address, phone number, email address and the date and purpose
// of the modification.
//
// THE SOFTWARE IS PROVIDED AS-IS WITHOUT ANY WARRANTY OF ANY KIND, EITHER
// EXPRESS, IMPLIED OR STATUTORY, INCLUDING BUT NOT LIMITED TO ANY WARRANTY
// THAT THE SOFTWARE WILL CONFORM TO SPECIFICATIONS OR BE ERROR-FREE AND ANY
// IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE,
// TITLE, OR NON-INFRINGEMENT.  IN NO EVENT SHALL CARNEGIE MELLON UNIVERSITY
// BE LIABLE FOR ANY DAMAGES, INCLUDING BUT NOT LIMITED TO DIRECT, INDIRECT,
// SPECIAL OR CONSEQUENTIAL DAMAGES, ARISING OUT OF, RESULTING FROM, OR IN
// ANY WAY CONNECTED WITH THIS SOFTWARE (WHETHER OR NOT BASED UPON WARRANTY,
// CONTRACT, TORT OR OTHERWISE).
//
// DO-NOT-REMOVE end-copyright-block
/*
  V9 Memory Op
*/




namespace nInorderSimicsFeeder {

std::set<Flexus::Simics::API::conf_object_t *> theTimingModels;

class SimicsTracerImpl  {
  private:
    conf_object_t * theUnderlyingObject;
    conf_object_t * theCPU;
    index_t theIndex;
    boost::shared_ptr<SimicsTraceConsumer> theConsumer;
    SimicsCycleManager * theCycleManager; //Non-owning pointer
    StoreBuffer theStoreBuffer;
    bool theInterruptsEnabled;
    bool theIgnoreFlag;
    conf_object_t * thePhysMemory;
    conf_object_t * thePhysIO;
    physical_address_t current_pc;

  public:
    //Default Constructor, creates a SimicsTracer that is not connected
    //to an underlying Simics object.
    SimicsTracerImpl(conf_object_t * anUnderlyingObject)
      : theUnderlyingObject(anUnderlyingObject)
      , theConsumer()
      , theInterruptsEnabled(true)
      , theIgnoreFlag(false)
    {}

    // Initialize the tracer to the desired CPU
    void init(conf_object_t * aCPU, index_t anIndex) {
        theCPU = aCPU;
        theIndex = anIndex;

        attr_value_t attr;

        theIndex = anIndex;


        if (SIM_class_has_attribute(theCPU->class_data, 
"turbo_execution_mode")) {
          DBG_Assert( false , ( << "Simics appears to be running with the -fast 
option.  You must launch Simics with -stall to run in-order simulations") );
        }

        attr = SIM_get_attribute(theCPU, "ooo_mode");
        if ( attr.kind != Sim_Val_String || std::string(attr.u.string) != 
"in-order" ) {
          DBG_Assert( false , ( << "Simics appears to be running with the -ma 
option.  You must launch Simics with -stall to run in-order simulations") );
        }

        attr.kind = Sim_Val_Object;
        attr.u.object = theCPU;

        SIM_set_attribute(theUnderlyingObject, "queue", &attr);


        attr.kind = Sim_Val_Object;
        attr.u.object = theUnderlyingObject;

        /* Tell memory we have a mem hier */
        thePhysMemory = SIM_get_attribute(theCPU, "physical_memory").u.object;
        SIM_set_attribute(thePhysMemory, "timing_model", &attr);
        if (theTimingModels.count(thePhysMemory) > 0) {
          DBG_Assert( false, ( << "Two CPUs connected to the same memory 
timing_model: " << thePhysMemory->name) );
        }
        theTimingModels.insert(thePhysMemory);


        #if FLEXUS_TARGET_IS(v9)
          //We only use the snoop interface for v9
          /* Tell memory we have a mem hier */
          SIM_set_attribute(thePhysMemory, "snoop_device", &attr);


          /* Tell memory we have a mem hier */
          thePhysIO = SIM_get_attribute(theCPU, "physical_io").u.object;
          SIM_set_attribute(thePhysIO, "timing_model", &attr);
        #endif //FLEXUS_TARGET_IS(v9)


        if (theTimingModels.count(thePhysIO) > 0) {
          DBG_Assert( false, ( << "Two CPUs connected to the same I/O 
timing_model: " << thePhysIO->name) );
        }
        theTimingModels.insert(thePhysIO);

        current_pc = 0;

    }

    void setTraceConsumer(boost::shared_ptr<SimicsTraceConsumer> aConsumer) {
        theConsumer = aConsumer;
        theConsumer->init(theIndex);
    }

    void setCycleManager(SimicsCycleManager & aManager) {
        theCycleManager = &aManager;
    }


    void setIgnore() {
      DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex) ( << "set ignore" ) );
      theIgnoreFlag = true;
    }

    void clearIgnore() {
      DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex) ( << "clear ignore" ) );
      theIgnoreFlag = false;
    }

    bool ignore() const {
      return theIgnoreFlag;
    }

    void enableInterrupts() {
      if (! theInterruptsEnabled) {
        theInterruptsEnabled = true;
        attr_value_t attr;
        attr.kind = Sim_Val_Integer;
        #if FLEXUS_TARGET_IS(v9)
          attr.u.integer = 1;
          SIM_set_attribute(theCPU, "extra_irq_enable", &attr);
        #elif FLEXUS_TARGET_IS(x86)
          attr.u.integer = 0;
          SIM_set_attribute(theCPU, "temporary_interrupt_mask", &attr);
        #endif //FLEXUS_TARGET_IS(v9)
      }
    }

    void disableInterrupts() {
      if (theInterruptsEnabled) {
        theInterruptsEnabled = false;
        attr_value_t attr;
        attr.kind = Sim_Val_Integer;
        #if FLEXUS_TARGET_IS(v9)
          attr.u.integer = 0;
          SIM_set_attribute(theCPU, "extra_irq_enable", &attr);
        #elif FLEXUS_TARGET_IS(x86)
          attr.u.integer = 1;
          SIM_set_attribute(theCPU, "temporary_interrupt_mask", &attr);
        #endif //FLEXUS_TARGET_IS(v9)
      }
    }

    bool interruptsEnabled() const {
      return theInterruptsEnabled;
    }


    DoubleWord readStoreBuffer(PhysicalMemoryAddress const & anAlignedAddress) {
      StoreBuffer::iterator iter = theStoreBuffer.find(anAlignedAddress);
      if (iter != theStoreBuffer.end()) {
        //DBG_(Tmp, ( << "Store buffer access hit for address: " << &std::hex 
<< anAlignedAddress << &std::dec << " returning value: " << 
iter->second.theNewValue ));
        return iter->second.theNewValue;
      }
      return DoubleWord();
    }

    //Returns the current contents of the store buffer for a memory location
    DoubleWord getMemoryValue(PhysicalMemoryAddress & anAlignedAddress) {
      unsigned long long memory( Simics::API::SIM_read_phys_memory(theCPU, 
anAlignedAddress, 8) );
      DoubleWord store_buffer( readStoreBuffer(anAlignedAddress) );
      DoubleWord ret_val(memory, store_buffer);
      DBG_(VVerb, ( << "Applying SB: " << store_buffer << " to " << & std::hex 
<< memory << & std::dec << " results in " << ret_val << &std::dec  ));
      return ret_val;
    }


    cycles_t trace_mem_hier_operate(conf_object_t *space, map_list_t *map, 
generic_transaction_t *aMemTrans) {
      memory_transaction_t * mem_trans = reinterpret_cast<memory_transaction_t 
*>(aMemTrans);
      #ifdef FLEXUS_FEEDER_TRACE_DEBUGGER
        theConsumer->debugger.nextCallback(mem_trans);
      #endif //FLEXUS_FEEDER_TRACE_DEBUGGER

      int stall = real_hier_operate(space, map, mem_trans);

      #ifdef FLEXUS_FEEDER_TRACE_DEBUGGER
        theConsumer->debugger.ret(stall);

        // verify we aren't trying to stall when it is not permitted
        if( (!mem_trans->s.may_stall) && (stall > 0) ) {
          DBG_(Crit, SetNumeric( (FlexusIdx) theIndex )
                   ( << "stalling " << stall << " cycles when not permitted" ) 
);
          theConsumer->debugger.dump();
        }
      #endif //FLEXUS_FEEDER_TRACE_DEBUGGER

      return stall;
    }


    //Useful debugging stuff for tracing every instruction
    void debugTransaction(memory_transaction_t *mem_trans) {
        logical_address_t pc_logical = SIM_get_program_counter(theCPU);
        physical_address_t pc = SIM_logical_to_physical(theCPU, 
Sim_DI_Instruction, pc_logical);
        if (SIM_clear_exception() != SimExc_No_Exception) {
          DBG_(Tmp, SetNumeric( (FlexusIdx) theIndex)
                  ( << "Instruction results in ITLB Miss pc_logical: " << 
pc_logical)
            );
           return;
        }
        tuple_int_string_t * retval = SIM_disassemble(theCPU, pc , 0);
        (void)retval; //suppress unused variable warning

        DBG_(Tmp, SetNumeric( (FlexusIdx) theIndex)
                  ( << "Mem Heir Instr: " << retval->string << " pc: " << pc)
            );

      if (SIM_mem_op_is_data(&mem_trans->s)) {
        if (SIM_mem_op_is_write(&mem_trans->s)) {
          DBG_(Tmp, SetNumeric( (FlexusIdx) theIndex)
                    ( << "  Write @" << &std::hex << 
mem_trans->s.physical_address << &std::dec << '[' << mem_trans->s.size << ']'
                      << " type=" << mem_trans->s.type
                      << (  mem_trans->s.atomic ? " atomic" : "" )
                      << (  mem_trans->s.may_stall ? "" : " no-stall" )
                      << (  mem_trans->s.inquiry ? " inquiry" : "")
                      << (  mem_trans->s.speculative ? " speculative" : "")
                      << (  mem_trans->s.ignore ? " ignore" : "")
                    )
            );
        } else {
          if ( mem_trans->s.type == Simics::API::Sim_Trans_Prefetch) {
              DBG_(Tmp, SetNumeric( (FlexusIdx) theIndex)
                        ( << "  Prefetch @" << &std::hex << 
mem_trans->s.physical_address << &std::dec << '[' << mem_trans->s.size  << ']'
                          << " type=" << mem_trans->s.type
                          << (  mem_trans->s.atomic ? " atomic" : "" )
                          << (  mem_trans->s.may_stall ? "" : " no-stall" )
                          << (  mem_trans->s.inquiry ? " inquiry" : "")
                          << (  mem_trans->s.speculative ? " speculative" : "")
                          << (  mem_trans->s.ignore ? " ignore" : "")
                        )
                );
          } else {
              DBG_(Tmp, SetNumeric( (FlexusIdx) theIndex)
                        ( << "  Read @" << &std::hex << 
mem_trans->s.physical_address << &std::dec << '[' << mem_trans->s.size  << ']'
                          << " type=" << mem_trans->s.type
                          << (  mem_trans->s.atomic ? " atomic" : "" )
                          << (  mem_trans->s.may_stall ? "" : " no-stall" )
                          << (  mem_trans->s.inquiry ? " inquiry" : "")
                          << (  mem_trans->s.speculative ? " speculative" : "")
                          << (  mem_trans->s.ignore ? " ignore" : "")
                        )
                );
          }
        }
      }
    }

    bool requiresSync(memory_transaction_t *mem_trans) {
      if ( mem_trans->s.size > 8) {
        //Anything larger than 8 bytes must sync
        return true;
      }

      #if FLEXUS_TARGET_IS(v9)
        //If we are using a stange ASI, mark this as a sync
        switch ( mem_trans->address_space ) {
          //Privileged
            case 0x04: //NUCLEUS
            case 0x0C: //NUCLEUS_LITTLE
            case 0x10: //AS_IF_USER_PRIMARY
            case 0x11: //AS_IF_USER_SECONDARY
            case 0x18: //AS_IF_USER_PRIMARY_LITTLE
            case 0x19: //AS_IF_USER_SECONDARY_LITTLE
            case 0x24: //NUCLEUS_QUAD_LDD
            case 0x2C: //NUCLEUS_QUAD_LDD_LITTLE
            //User
            case 0x81: //SECONDARY
            case 0x88: //PRIMARY_LITTLE
            case 0x89: //SECONDARY_LITTLE
            case 0x80: //PRIMARY
              break;
          default:
            //Any other ASI requires a Sync
            DBG_(Iface, SetNumeric( (FlexusIdx) theIndex)
                 (    << "Alternate ASI " << mem_trans->address_space <<" @" << 
&std::hex
                      << mem_trans->s.physical_address
                      << '[' << &std::dec << mem_trans->s.size << ']'
                 )
            );

            return true;
        }

        //Non-cacheable operations require a sync
        if ( (! mem_trans->cache_virtual) || ( ! mem_trans->cache_physical) ) {
          DBG_(Iface, SetNumeric( (FlexusIdx) theIndex)
               (    << "Non-cacheable @" << &std::hex
                    << mem_trans->s.physical_address
                    << '[' << &std::dec << mem_trans->s.size << ']'
               )
          );
          return true;
        }
      #endif //FLEXUS_TARGET_IS(v9)

      return false;
    }

    cycles_t real_hier_operate(conf_object_t *space, map_list_t *map, 
memory_transaction_t *mem_trans) {
    //NOTE: Multiple return paths
      const int k_no_stall = 0;
      const int k_call_me_next_cycle = 1;



      //debugTransaction(mem_trans);

      //Ensure that we see future accesses to this block
      mem_trans->s.block_STC = 1;

      //Case 0: This is an IO access
      //============================

        //All IO accesses must stall until SB is empty
        if (space == thePhysIO) {
          if ( ! theStoreBuffer.empty() ) {
            DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "  Got an IO 
access while the Store buffer contains data.  Must stall access till the store 
buffer flushes." ) );

            // JWK:
            // Previous IO (store) instruction should be marked as IO 
instruction here...
            // Execute doesn't release it until SB becomes empty
            // This instruction must be the last one in theEntries....
            theConsumer->theEntries.back().theReadyInstruction->setIO();

            theCycleManager->advanceFlexus();
            cycles_t stall_cycles = 
theCycleManager->reconcileTime(theConsumer->queueSize());

            if (stall_cycles == 0) {
              disableInterrupts();
              return k_call_me_next_cycle;
            } else {
              disableInterrupts();
              return stall_cycles;
            }
          } else {
            DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "  Got an IO 
access but store buffer is empty.  Access may proceed." ) );
            //Otherwise, we allow the access to complete.
            return k_no_stall;
          }

        }


      //Case 1: Previously pending operation has completed since last call
      //==================================================================
          //See if perviously pending instructions have been completed
          if(theConsumer->isComplete()) {
            // We have completed the instruction that was previously stalled 
since
            // the last time we were called by Simics

            DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "Case 1: 
completed operation" ) );

            if ( mem_trans->s.type == Sim_Trans_Instr_Fetch ) {
                DBG_Assert( current_pc == mem_trans->s.physical_address ) ;
                DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "Ignoring 
re-issue of fetch operation." ) );
                return 0;
            }

            // Indicate that Simics is aware the instruction is complete
            theConsumer->simicsDone();

            //If we have placed this store instruction in the store buffer, then
            //we should prevent simics from performing the store operation now
            if (ignore()) {
              //Clear the ignore flag
              clearIgnore();

              //DBG_Assert( ( mem_trans->s.type == Sim_Trans_Store ) /*|| ( 
mem_trans->s.type == Sim_Trans_Load )*/);

              if ( ( mem_trans->s.type == Sim_Trans_Store ) ) {

                //Tell Simics not to perform this store operation
                mem_trans->s.ignore = 1;
                DBG_( VVerb, ( << "Ignoring this operation" ) );
              } else {
                DBG_( Crit, SetNumeric( (FlexusIdx) theIndex ) ( << "Expected 
to ignore a store completion, but got something that isn't a store.  
Transaction follows: " ) );
                debugTransaction(mem_trans);
              }
            }

            //We are completing the current instruction, so interrupts are ok
            enableInterrupts();

            // return zero cycle latency
            return k_no_stall;
          }


      //All the code below needs to distinguish fetches from data operations
        bool is_fetch = ( mem_trans->s.type == Sim_Trans_Instr_Fetch);
        bool is_data = ( mem_trans->s.type == Sim_Trans_Load ) || ( 
mem_trans->s.type == Sim_Trans_Store ) || ( mem_trans->s.type == 
Sim_Trans_Prefetch);

      //Case 2: We have a pending data operation that has not been completed
      //====================================================================
      // See if we have a pending operation.
        if (! theConsumer->isIdle()) {
          //There is some operation pending for this cpu, and it has not yet
          //been completed by flexus.

          DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "Case 2: 
incomplete pending op" ) );

          //First, some assertions to make sure Simics has given us back the
          //same transaction as we were stalled on previously.

          if ( is_fetch ) {
            DBG_Assert( current_pc == 0 || current_pc == 
mem_trans->s.physical_address, ( << "CPU" << theIndex << " current_pc=" << 
current_pc << " mem_trans->s.physical_address=" << 
mem_trans->s.physical_address)) ;
            DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "Ignoring 
re-issue of instruction fetch" ) );
            return 0;
          }
          else if ( is_data ) {
            //Ensure that this data operation is going t
            if( PhysicalMemoryAddress(mem_trans->s.physical_address) !=
                theConsumer->optimizedGetInstruction().physicalMemoryAddress() 
) {
              DBG_( Crit, SetNumeric( (FlexusIdx) theIndex )
                          ( << "data addresses don't match for reissued Simics 
request. Expected Instr: "
                            << theConsumer->optimizedGetInstruction()
                            << " Received: "
                            << mem_trans->s.physical_address
                          ) );
              #ifdef FLEXUS_FEEDER_TRACE_DEBUGGER
                theConsumer->debugger.dump();
              #endif //FLEXUS_FEEDER_TRACE_DEBUGGER
            }
          }
          else {
            DBG_(Crit, SetNumeric( (FlexusIdx) theIndex )
                       ( << "while pending, got a new transaction that is 
neither inst nor data" ) );
          }

          //Interrupts should be disabled
          DBG_Assert( ! interruptsEnabled() ) ;

          #ifndef FLEXUS_FEEDER_OLD_SCHEDULING

            // Try to advance Flexus
            theCycleManager->advanceFlexus();
            cycles_t stall_cycles = 
theCycleManager->reconcileTime(theConsumer->queueSize());

            if (stall_cycles > 0) {
              //All pending operations have not yet completed.
              //Interrupts remain disabled.  We will be called again.
              return stall_cycles;
            } else {
              //We can not advance flexus further.  This is either because
              //this cpu has completed all its pending operations, or because
              //some other cpu has run out of instructions.

              if(theConsumer->isComplete()) {
                //This cpu has completed all its operations.

                //Check if we should prevent Simics from executing the
                //operation
                if (ignore()) {
                  //We should.  This should only be the case for store
                  //instructions
                  DBG_Assert( ( mem_trans->s.type == Sim_Trans_Store )    );

                  //Inform Simics to ignore the store
                  mem_trans->s.ignore = 1;

                  //Clear the store pending flag
                  clearIgnore();
                  DBG_( VVerb, ( << "Ignoring this store" ) );
                }

                //Indicate to the consumer that we have notified simics that
                //the instruction is complete
                theConsumer->simicsDone();

                enableInterrupts();
                return k_no_stall;
              } else {
                //Some cpu has run out of instructions, but this cpu is still
                //pending. Interrupts remain disabled.  We will be called again.
                return k_call_me_next_cycle;  //Call back next cycle
              }
            }

          #else //FLEXUS_FEEDER_OLD_SCHEDULING
            return k_call_me_next_cycle;  //Call back next cycle
          #endif //FLEXUS_FEEDER_OLD_SCHEDULING

        }

      //Case 3: We have a new instruction fetch
      //=======================================
        if ( is_fetch ) {
          DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "Case 3: new 
instruction fetch" ) );

          if (theFlexus->quiescing() ) {
            //We halt new instruction fetches when we are trying to quiesce 
Flexus
            return k_call_me_next_cycle;
          }

          // Instruction fetches should never be atomic memory operations
          DBG_Assert(!mem_trans->s.atomic,SetNumeric( (FlexusIdx) theIndex) );

          //If we have many instructions queued, we will advance Flexus
          //to drain the queue.  This may result in stall cycles.  However,
          //by default, we return without stall

          cycles_t ret = k_no_stall;

          #ifndef FLEXUS_FEEDER_OLD_SCHEDULING
            if(theConsumer->largeQueue()) {
              //theConsumer has many instructions queued.  We advance.

              DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "advancing 
large queue" ) );
              theCycleManager->advanceFlexus();

              //We give reconcileTime the number of pending instructions plus
              //1 for the instruction we have not yet queued.
              ret = theCycleManager->reconcileTime(theConsumer->queueSize() + 
1); //Take the instruction that we have not yet queued into account

            }
          #endif

          // create a new instruction object
          intrusive_ptr<ArchitecturalInstruction> new_inst( new 
ArchitecturalInstruction(theConsumer.get()) ) ;

          // set the PC
          new_inst->setVirtInstAddress( 
VirtualMemoryAddress(mem_trans->s.logical_address) );
          new_inst->setPhysInstAddress( 
PhysicalMemoryAddress(mem_trans->s.physical_address) );

          current_pc = mem_trans->s.physical_address;

          #if FLEXUS_TARGET_IS(v9)
            //See if the instruction is a MEMBAR
            unsigned long op_code = SIM_read_phys_memory(theCPU, 
mem_trans->s.physical_address, 4);
            new_inst->setOpcode(op_code);


            //MEMBAR  is 1000 0001 0100 0011 11-- ---- ---- ----
            const unsigned long kMEMBAR_mask = 0xFFFFC000;
            const unsigned long kMEMBAR_pattern = 0x8143C000;
            if ((op_code & kMEMBAR_mask) == kMEMBAR_pattern) {
              //It is a MEMBAR. Figure out which MEMBAR it is.  We care about:
                //MEMBAR #sync            8143E0040
                //MEMBAR #memissue        8143E0020
                //MEMBAR #lookaside       8143E0010
                //MEMBAR #storeload       8143E0002
              const unsigned long kMEMBAR_SYNC_mask = 0x72;

              if ( (op_code & kMEMBAR_SYNC_mask) != 0) {
                DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "  
Instruction is a syncing MEMBAR" ) );
                new_inst->setIsMEMBAR();
              }

            }
          #else
            //Reading op-codes not supported on x86
            new_inst->setOpcode(0);
          #endif //FLEXUS_TARGET_IS(v9)

          //Pass the newly created instruction to the consumer
          theConsumer->consumeInstOperation(new_inst);

          //If we did not stall while advancing, or did not adance
          if (ret == 0) {
            enableInterrupts();
          } else {
            disableInterrupts();
          }
          return ret;

        }

      //Case 4: We have a data operation to attach to the previously fetched 
instruction
      
//================================================================================
        if ( is_data ) {
          DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "Case 4: Data 
operation" ) );


          bool is_write = ( mem_trans->s.type == Sim_Trans_Store );
          bool is_atomic = mem_trans->s.atomic;

          //Case 4.1: Atomic memory operation
          //=================================

            // for atomic operations, if this is the first half (the read), just
            // append the data reference onto the most recent instruction; if 
this
            // is the second half, verify that it matches with the first half
            if( is_atomic ) {

              DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "Case 4.1: 
Atomic operation" ) );

              //Obtain the physical PC of this operation so we can verify
              //that it matches the first half of the atomic operation
              logical_address_t pc_logical = SIM_get_program_counter(theCPU);
              physical_address_t pc = SIM_logical_to_physical(theCPU, 
Sim_DI_Instruction, pc_logical);
              if (SIM_clear_exception() != SimExc_No_Exception) {
                pc = 0;
              }

              //See if this is the first or second half of the operation.  It
              //is the second half if theConsumer's queue is empty, since this
              //means we just got a second back-to-back data operation, which
              //only occurs in the case of the second mem ops
              if(theConsumer->theEntries.empty() && 
theConsumer->isAtomicOperationPending()) {
                DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "  second 
half of atomic operation" ) );

                //Verify that the second half of the atomic operation matches
                //the first half that we just completed
                theConsumer->verifyAtomicOperation(is_write, 
PhysicalMemoryAddress(pc), PhysicalMemoryAddress(mem_trans->s.physical_address 
) );

                DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex )
                           ( << "  verified atomic operation" ) );

                enableInterrupts();
                return k_no_stall;
              } else {
                //First half of an atomic operation
                DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "  first 
half of atomic operation" ) );

                //Better not be a write
                DBG_Assert(!is_write);

                // Obtain a reference to the previous fetch so we can add the 
data
                // operation to it.
                ArchitecturalInstruction & inst = 
theConsumer->optimizedGetInstruction();

                //Ensure that we are not overwriting another instruction
                DBG_Assert(inst.isNOP());

                //Fill in the physical address for this memory operation
                inst.setAddress( 
PhysicalMemoryAddress(mem_trans->s.physical_address) );

                #if FLEXUS_TARGET_IS(v9)
                  // record the opcode
                  unsigned long op_code = SIM_read_phys_memory(theCPU, pc, 4);
                  inst.setOpcode(op_code);

                  //LDD(a)            is 11-- ---0 -001 1--- ---- ---- ---- ----
                  //STD(a)            is 11-- ---0 -011 1--- ---- ---- ---- ----

                  //LDSTUB(a)/SWAP(a) is 11-- ---0 -11- 1--- ---- ---- ---- ----
                  //CAS(x)A           is 11-- ---1 111- 0--- ---- ---- ---- ----

                  const unsigned long kLDD_mask = 0xC1780000;
                  const unsigned long kLDD_pattern = 0xC0180000;

                  const unsigned long kSTD_mask = 0xC1780000;
                  const unsigned long kSTD_pattern = 0xC038000;

                  const unsigned long kRMW_mask = 0xC1680000;
                  const unsigned long kRMW_pattern = 0xC0680000;

                  const unsigned long kCAS_mask = 0xC1E80000;
                  const unsigned long kCAS_pattern = 0xC1E00000;

                  if ((op_code & kLDD_mask) == kLDD_pattern) {
                    inst.setIsLoad();
                  } else if ((op_code & kSTD_mask) == kSTD_pattern) {
                    inst.setIsStore();
                    inst.setSync();
                  } else if (((op_code & kRMW_mask) == kRMW_pattern) || 
((op_code & kCAS_mask) == kCAS_pattern))  {
                    inst.setIsRmw();
                  } else {
                    DBG_Assert( false, ( << "Unknown atomic operation. Opcode: 
" << std::hex << op_code << " pc: " << pc << std::dec ) );
                  }
                #else
                  //Assume all atomic x86 operations are RMWs.  This may not
                  //be true
                  inst.setIsRmw();
                #endif //v9

                theConsumer->recordAtomicVerification( 
PhysicalMemoryAddress(pc), PhysicalMemoryAddress(mem_trans->s.physical_address 
));

                //Remember the size of the rmw
                inst.setSize(mem_trans->s.size);

                #if FLEXUS_TARGET_IS(v9)
                  if (mem_trans->priv) {
                    //Mark privileged operations
                    inst.setPriv();
                  }
                #elif FLEXUS_TARGET_IS(x86)
                  if (mem_trans->mode == Sim_CPU_Mode_Supervisor) {
                    //Mark privileged operations
                    inst.setPriv();
                  }
                #endif //FLEXUS_TARGET_IS(v9)


                // add this data instruction to the consumer
                theConsumer->consumeDataOperation();


              }
          //Case 4.2: Load or store operation
          //=================================
            } else {

              DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "Case 4.2: 
Load or Store operation" ) );

              if ( theConsumer->isEmpty() ) {

                DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "Case 4.2a: 
non-stallable LDD or STD" ) );

                //Non-stallable LDD and STD instructions are the only cases 
where
                //we can have a memory operation while theConsumer is empty
/*
                if ( ! ( ! mem_trans->s.may_stall ) ) { //was an assert
                  DBG_( Crit, SetNumeric( (FlexusIdx) theIndex ) ( << "Expected 
a non-stallable LDD or STD, but operation is stallable.  Transaction follows: " 
) );
                  debugTransaction(mem_trans);
                }
*/

                //We simulate a fetch operation here.
                //Need to get a PC.
                logical_address_t pc_logical = SIM_get_program_counter(theCPU);
                physical_address_t pc = SIM_logical_to_physical(theCPU, 
Sim_DI_Instruction, pc_logical);
                if (SIM_clear_exception() != SimExc_No_Exception) {
                  pc = 0;
                }

                // create a new instruction object
                intrusive_ptr<ArchitecturalInstruction> new_inst( new 
ArchitecturalInstruction( theConsumer.get() ) ) ;

                // set the PC
                new_inst->setPhysInstAddress( PhysicalMemoryAddress( pc ) );

                //Pass the newly created instruction to the consumer
                theConsumer->consumeInstOperation(new_inst);

              }


              // Obtain a reference to the previous fetch so we can add the data
              // operation to it.
              ArchitecturalInstruction & inst = 
theConsumer->optimizedGetInstruction();

              //Fill in the physical address for this memory operation
              inst.setAddress( 
PhysicalMemoryAddress(mem_trans->s.physical_address) );

              if (is_write) {
                //Indicate that its a store
                inst.setIsStore();
              } else {
                inst.setIsLoad();
              }

              //See if it meets any of the conditions which require a sync
              if ( requiresSync(mem_trans) ) {
                DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "  store 
requires sync" ) );
                inst.setSync();
              }

              //See if this is a store operation which may use the store buffer
              if (is_write && !inst.isSync() && FLEXUS_TARGET_IS(v9) ) {

                DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "Case 4.2b: 
Store which uses Store Buffer" ) );

                //This store does not require a Sync, so it may use the store
                //buffer

                //Obtain the double-word-aligned address
                PhysicalMemoryAddress 
aligned_addr(mem_trans->s.physical_address & ~7LL);

                //Construct the new word-aligned value
                DoubleWord new_value;
                new_value.set( SIM_get_mem_op_value_cpu(&mem_trans->s), 
mem_trans->s.size, (mem_trans->s.physical_address & 7));

                DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex) 
Addr(mem_trans->s.physical_address) ( << "  SB-entry Store @" << &std::hex << 
mem_trans->s.physical_address << '[' << &std::dec << mem_trans->s.size << "] 
aligned: " << &std::hex << aligned_addr << &std::dec << " new value: " << 
new_value) );

                //Prevent mem op from modifying memory
                  setIgnore();
                //enter the memory op into the store buffer.
                std::pair<StoreBuffer::iterator, bool> entry =
                    theStoreBuffer.insert
                      (  std::make_pair
                            ( aligned_addr
                            , StoreBufferEntry( new_value )
                            )
                      );

                //Already had an entry, coalescing
                if (! entry.second) {
                  //Increment the outstanding store count
                  ++( entry.first->second);

                  DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex ) ( << "  
coalescing with existing entry" ) );

                  //Change the new value
                  entry.first->second.theNewValue.set( 
SIM_get_mem_op_value_cpu(&mem_trans->s), mem_trans->s.size, 
(mem_trans->s.physical_address & 7) ) ;
                }

                //Attach the store buffer to the instruction
                inst.setStoreBuffer(&theStoreBuffer);

                //Remember the data and size of the store, so we can perform
                //it later
                inst.setData(new_value);
              }

              //Remember the size of the load or store
              inst.setSize(mem_trans->s.size);

              #if FLEXUS_TARGET_IS(v9)
                if (mem_trans->priv) {
                  //Mark privileged operations
                  inst.setPriv();
                }
              #elif FLEXUS_TARGET_IS(x86)
                if (mem_trans->mode == Sim_CPU_Mode_Supervisor) {
                  //Mark privileged operations
                  inst.setPriv();
                }
              #endif //FLEXUS_TARGET_IS(v9)

              // add this data instruction to the consumer
              theConsumer->consumeDataOperation();
            }

          //Advance Flexus for sub-cases of Case 4
          //========================================

            if ( ! mem_trans->s.may_stall ) {
              //We complete the STD / LDD without advancing flexus

              //See if the store operation should be supressed
              if (ignore()) {

                mem_trans->s.ignore = 1;
                clearIgnore();
                DBG_( VVerb, ( << "  ignoring this op" ) );
              }

              theConsumer->simicsDone();

              enableInterrupts();
              // return zero cycle latency
              return k_no_stall;
            } else {

              #ifndef FLEXUS_FEEDER_OLD_SCHEDULING
                //Advance flexus and determine if we must stall
                theCycleManager->advanceFlexus();
                DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex )
                           ( << "  Advancing flexus" ) );
                cycles_t stall_cycles = 
theCycleManager->reconcileTime(theConsumer->queueSize());

                if (stall_cycles > 0) {
                  //We must stall, so we will have an operation pending.  
Disable
                  //interrupts so Simics doesn't change the operation on us
                  disableInterrupts();
                  return stall_cycles;
                } else {
                  //No need to stall
                  if(theConsumer->isComplete()) {
                    //We finished the pending operation

                    //See if the store operation should be supressed
                    if (ignore()) {
                      mem_trans->s.ignore = 1;
                      DBG_Assert( ( mem_trans->s.type == Sim_Trans_Store )    );
                      clearIgnore();
                      DBG_( VVerb, ( << "  ignoring this op" ) );
                    }

                    //Indicate that we are done
                    theConsumer->simicsDone();

                    //Allow interrupts again.
                    enableInterrupts();
                    return k_no_stall;
                  } else {
                    //We must stall, so we will have an operation pending.  
Disable
                    //interrupts so Simics doesn't change the operation on us
                    disableInterrupts();
                    return k_call_me_next_cycle;  //Call back next cycle
                  }
                }
              #else //defined(FLEXUS_FEEDER_OLD_SCHEDULING)
                disableInterrupts();
                return k_call_me_next_cycle;  //Call back next cycle
              #endif //FLEXUS_FEEDER_OLD_SCHEDULING
            }
        }
      //Case 5: We have something that is neither instruction, nor data. We 
simply
      //complete these without taking any action in flexus
      
//================================================================================
          //Neither instruction nor data access.  We will ignore this.
          /* shouldn't happen? */
          DBG_(Crit, SetNumeric( (FlexusIdx) theIndex) ( << "Case 5: Neither 
instruction nor data" ) );

          clearIgnore();
          enableInterrupts();

          return k_no_stall;
    }


    cycles_t trace_snoop_operate(conf_object_t *space, map_list_t *map, 
generic_transaction_t *aMemTrans) {

      memory_transaction_t * mem_trans = reinterpret_cast<memory_transaction_t 
*>(aMemTrans);

      if (Simics::API::SIM_mem_op_is_data(&mem_trans->s) && (mem_trans->s.size 
<= 8) ) {
        //We do not snoop PREFETCH and block load/store operations

        //Obtain the word-aligned address
        PhysicalMemoryAddress aligned_addr(mem_trans->s.physical_address & 
~7LL);

        DBG_(VVerb, Condition(Simics::API::SIM_mem_op_is_write(&mem_trans->s))
                   SetNumeric( (FlexusIdx) theIndex)
                   (  << "Snoop interface Write @"
                      << &std::hex  << mem_trans->s.physical_address
                      << " aligned: " << aligned_addr << &std::dec
                   )
            );

        PhysicalMemoryAddress interesting_region(mem_trans->s.physical_address 
& ~0xFFLL);

        //If this transaction is a store, assert that we have a store buffer 
enty
        //for it
        if (! Simics::API::SIM_mem_op_is_write(&mem_trans->s)) {
          //Assert that Simics got what we think is the new value

          DoubleWord value_according_to_sb(getMemoryValue(aligned_addr));
          DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex) ( << "    Value 
including SB contents: " << value_according_to_sb ) );

          //The correct value is in the store buffer
          switch( mem_trans->s.size) {
            case 1:
              {
                  unsigned char our_value = 
value_according_to_sb.getByte(mem_trans->s.physical_address & 7);
                  SIM_set_mem_op_value_cpu(&mem_trans->s, our_value);
                  DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex) ( << " snoop 
value: " << &std::hex << (unsigned int)our_value) );
              }
              break;
            case 2:
              {
                  unsigned short our_value = 
value_according_to_sb.getHalfWord(mem_trans->s.physical_address & 7);
                  SIM_set_mem_op_value_cpu(&mem_trans->s, our_value);
                  DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex) ( << " snoop 
value: " << &std::hex << our_value ) );
              }
              break;
            case 4:
              {
                  unsigned long our_value = 
value_according_to_sb.getWord(mem_trans->s.physical_address & 7);
                  SIM_set_mem_op_value_cpu(&mem_trans->s, our_value);
                  DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex) ( << " snoop 
value: " << &std::hex << our_value ) );
              }
              break;
            case 8:
              {
                  unsigned long long our_value = 
value_according_to_sb.getDoubleWord(mem_trans->s.physical_address & 7);
                  SIM_set_mem_op_value_cpu(&mem_trans->s, our_value);
                  DBG_(VVerb, SetNumeric( (FlexusIdx) theIndex) ( << " snoop 
value: " << &std::hex << our_value ) );
              }
              break;
            default:
              DBG_Assert( false, SetNumeric( (FlexusIdx) theIndex) ( << 
"Unsupported memory transaction size: " << mem_trans->s.size) );
          }

        }
      }

      return 0;
    }
  };  // class SimicsTracerImpl


class SimicsTracer : public Simics::AddInObject <SimicsTracerImpl> {
    typedef Simics::AddInObject<SimicsTracerImpl> base;
   public:
    static const Simics::Persistence  class_persistence = Simics::Session;
    static std::string className() { return "InOrderFeeder"; }
    static std::string classDescription() { return "Flexus's In-order 
instruction feeder."; }

    SimicsTracer() : base() { }
    SimicsTracer(Simics::API::conf_object_t * aSimicsObject) : 
base(aSimicsObject) {}
    SimicsTracer(SimicsTracerImpl * anImpl) : base(anImpl) {}
};


}
From shanlu at cs.uiuc.edu  Tue Oct 18 22:56:33 2005
From: shanlu at cs.uiuc.edu (shan)
List-Post: [email protected]
Date: Wed Oct 19 09:11:16 2005
Subject: [Simflex] RE: x86-multithread on SimFlex error
In-Reply-To: 
<pine.lnx.4.53l-ece.cmu.edu.0510181402180.12...@dalmore.ece.cmu.edu>
Message-ID: <[email protected]>

Hi Thomas,
  Thanks very much. I used your new file. The program runs longer but still
fails after a while. The screen output is like:
46 <flexus.cpp:240> {3276800}- Timestamp: 2005-Oct-18 21:45:08
47 <SimicsTracer.hpp:662> (<undefined>[<undefined>]) {3405134}- Assertion
failed: ((!(inst.isNOP()))) : <undefined>
***  Simics getting shaky, switching to 'safe' mode.
***  Simics (main thread) received an abort signal, probably an assertion.
<Simics is running in 'safe' mode>

  The SimicsTracer.hpp:662 seems to be operating on some atomic instruction.
I am pretty sure it is inside the pthread_create system call ...
  Do you need some other information? Maybe I can dump out what is the 'bad'
instruction's PC ...
  By the way, as for the TLB problem, I think I understand what you said.
But if that is the problem, even if you add error checking, you still can
not get the physical address, yes? how do you solve the problem?
Thanks
Shan 

-----Original Message-----
From: Thomas Wenisch [mailto:[email protected]] 
Sent: Tuesday, October 18, 2005 12:35 PM
To: shan
Cc: [email protected]
Subject: Re: x86-multithread on SimFlex error

Hi Shan,

On Tue, 18 Oct 2005, shan wrote:

> Hi,
>   I am trying the x86 module of Flexus. It works perfectly with single
> thread application, but when I try some multithreaded program (just a very
> simple toy application with pthread_create and pthread_join). There is
some
> error. The error information is like:
>
> ....
> 31 <flexus.cpp:240> {1835008}- Timestamp: 2005-Oct-18 09:57:25
> [cpu0] <address not in TLB>
> simics> c
> [cpu0] <address not in TLB>
> simics>

The address not in TLB exception is an exception that Simics raises
anytime some piece of code (i.e., Flexus) asks it to translate a logical
address to a physical address, but the translation is not available in the
TLB of the current CPU.  This situation is very unusual - it probably only
arises while the OS is manipulating TLB entries or is about to take a page
fault on an instruction reference or something similar.  However,
functions like pthread_create are more likely to create this situation.

The fact that this is causing your simulation to stop is a bug in Flexus.
Flexus uses the SIM_logical_to_physical call to get the physical PC
of instructions in a few special cases.  This call can fail if the
translation for the PC is not available in the TLB (which implies the CPU
is about to take an ITLB fault).  However, I forgot to include an error
check after the calls to SIM_logical_to_physical, so the exception ends up
propagating back to the Simics frontend, and stops the simulation.

The fix is to add an error check after the SIM_logical_to_physical calls
in components/InorderSimicsFeeder/SimicsTracer.hpp.  I have attached a
fixed version of the file to this email.  Note that I did not test it
(except to check that it compiles), as I do not have any x86 test images
handy.  We don't use x86 extensively here, which is why this bug has gone
unnoticed.

Please let me know if you continue to have problems. If so, I will help
you add a bunch of debugging messages so we can confirm if the problem is
actually what I think it is.

Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University

>
>   It happens after the program simulated a little while, maybe right at
the
> time when the thread is created.
>   I guess I miss some multi-thread related flag... what should I do to
> simulate a multithreaded application?
>   Oh, my toy application can run on simics without flexus module loaded.
> Thanks
> Shan
>
>

Reply via email to